ABSTRACT
Database replication is a mechanism to achieve scalability, for example, by executing queries independently on replica nodes. Partial replication is an approach to minimize the overall memory consumption of a replication cluster while still enabling a balanced load distribution among nodes to scale the query throughput linearly with the number of replicas. Partial replication reduces the cluster costs, speeds up data synchronization, and improves caching. However, load balancing may become skewed in the case of unexpected query distributions, unfavorable query timings, or node failures. To simulate and visualize the load balancing behavior for specific data fragment allocations, we implemented an interactive application. It allows users to retrace and evaluate the end-to-end performance of partially replicated database systems in varying experiments. Using our tool, we find that existing allocation approaches are either not memory-efficient or may result in load imbalances when nodes fail. We show that our novel robust allocation strategy achieves a better workload distribution with even less memory.
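The effect of a node failure on load balancing can be illustrated with a small simulation. The sketch below is not the authors' tool or allocation strategy. It assumes a hypothetical three-node cluster, invented fragment names and query costs, and a simple greedy rule that sends each query to the least-loaded live node storing all fragments the query needs.

```python
# Illustrative sketch only: the cluster, fragments, queries, and the greedy
# routing rule are assumptions, not taken from the paper or its tool.

def balance_load(allocation, queries, failed=frozenset()):
    """Route each query to the least-loaded live node that stores all of
    its fragments. Returns (load per live node, list of unroutable queries)."""
    load = {node: 0.0 for node in allocation if node not in failed}
    unroutable = []
    # Place the most expensive queries first (a common greedy heuristic).
    for name, (fragments, cost) in sorted(queries.items(),
                                          key=lambda kv: -kv[1][1]):
        # A node can run a query only if it holds every required fragment.
        candidates = [n for n in load if fragments <= allocation[n]]
        if not candidates:
            unroutable.append(name)
            continue
        # Break ties by node name so the result is deterministic.
        target = min(candidates, key=lambda n: (load[n], n))
        load[target] += cost
    return load, unroutable


# Hypothetical partial allocation: each node stores 2 of the 3 fragments.
allocation = {
    "n1": {"A", "B"},
    "n2": {"B", "C"},
    "n3": {"A", "C"},
}
# Hypothetical workload: query -> (required fragments, cost share in percent).
queries = {
    "q1": ({"A"}, 30),
    "q2": ({"B"}, 30),
    "q3": ({"C"}, 30),
    "q4": ({"A", "B"}, 10),
}

stored = sum(len(fragments) for fragments in allocation.values())
print("fragments stored:", stored, "of", 3 * len(allocation), "for full replication")
print("all nodes up:", balance_load(allocation, queries))
print("n1 failed:   ", balance_load(allocation, queries, failed={"n1"}))
```

With all nodes running, the load in this toy setup is 40, 30, and 30. After `n1` fails, `n2` carries 60 and `n3` carries 30, and `q4` cannot run at all because no remaining node stores both of its fragments. A robust allocation has to avoid this kind of imbalance and loss of query coverage.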