doi:10.1016/j.peva.2008.01.001
Published by Elsevier B.V.
Effective load balancing for cluster-based servers employing job preemption
Victoria Ungureanu (a), Benjamin Melamed (b) and Michael Katehakis (c)
(a) DIMACS Center, Rutgers University, 96 Frelinghuysen Road, Piscataway, NJ 08854, United States
(b) Department of MSIS, Rutgers University, 94 Rockafeller Road, Piscataway, NJ 08854, United States
(c) Department of MSIS, Rutgers University, 180 University Ave., Newark, NJ 07102, United States
Received 1 September 2005; revised 29 May 2007; accepted 13 January 2008. Available online 25 January 2008.
Abstract
A cluster-based server consists of a front-end dispatcher and multiple back-end servers. The dispatcher receives incoming jobs, and then decides how to assign them to back-end servers, which in turn serve the jobs according to some discipline. Cluster-based servers have been widely deployed, as they combine good performance with low costs.
Several assignment policies have been proposed for cluster-based servers, most of which aim to balance the load among back-end servers. There are two main strategies for load balancing: The first aims to balance the amount of workload at back-end servers, while the second aims to balance the number of jobs assigned to back-end servers. Examples of policies using these strategies are Dynamic and LC (Least Connected), respectively.
In this paper we propose a policy, called LC*, which combines the two aforementioned strategies. The paper shows experimentally that when preemption is admitted (i.e., when jobs execute concurrently on back-end servers), LC* substantially outperforms both Dynamic and LC in terms of response-time metrics. This improved performance is achieved by using only information readily available to the dispatcher, rendering LC* a practical policy to implement. Finally, we study a refinement, called ALC* (Adaptive LC*), which further improves on the response-time performance of LC* by adapting its actions to incoming traffic rates.
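The two load-balancing strategies named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Backend` class, the function names, and the assumption that the dispatcher knows each job's size are all hypothetical and introduced only for illustration. `pick_least_connected` follows the LC idea of balancing the number of jobs, while `pick_least_workload` stands in for a workload-balancing strategy of the kind the abstract attributes to Dynamic. LC* is not sketched, because the abstract does not specify how it combines the two.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical back-end server: tracks the sizes of its assigned jobs."""
    name: str
    jobs: list = field(default_factory=list)

    @property
    def job_count(self) -> int:
        # Number of jobs currently assigned (the quantity LC balances).
        return len(self.jobs)

    @property
    def workload(self) -> float:
        # Total outstanding work; assumes job sizes are known to the dispatcher.
        return sum(self.jobs)


def pick_least_connected(backends):
    """LC-style choice: the back-end with the fewest assigned jobs."""
    return min(backends, key=lambda b: b.job_count)


def pick_least_workload(backends):
    """Workload-balancing choice: the back-end with the least total work."""
    return min(backends, key=lambda b: b.workload)


def dispatch(backends, job_size, pick):
    """Assign one job of the given size using the supplied selection rule."""
    target = pick(backends)
    target.jobs.append(job_size)
    return target


# One large job on A, two small jobs on B: the two strategies disagree.
cluster = [Backend("A", [100.0]), Backend("B", [1.0, 1.0])]
lc_choice = pick_least_connected(cluster).name
workload_choice = pick_least_workload(cluster).name
```

In this example the job-count rule selects A (one job versus two), while the workload rule selects B (2 units of work versus 100), which shows how the two strategies can diverge when job sizes vary widely.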
Keywords: Cluster-based servers; Back-end server architecture; Job preemption; Simulation
Fig. 1. Architecture of a cluster-based server.
Fig. 2. Empirical request time series from a World Cup trace: (a) number of request arrivals per second; (b) total bytes requested per second.
Fig. 3. Successive average slowdowns of Dynamic, LC and Size-Range on a cluster with four back-end servers, when jobs are not preempted.
Fig. 4. Server queues under (a) Dynamic and Size-Range, and (b) LC.
Fig. 5. Successive average slowdowns for Dynamic, LC and Size-Range on a cluster with four back-end servers, when jobs are preempted (note the logarithmic scale on the vertical axis).
Fig. 6. Successive average slowdowns for LC and LC* on a cluster with four back-end servers.
Fig. 7. Successive average slowdowns for LC, ALC* and LC* on a cluster with four back-end servers.
Fig. 8. Successive average slowdowns for LC and ALC* on a two-way cluster employing delayed binding.
Fig. 9. (a) Back-end server processing when cache misses occur. (b) Successive average slowdowns of Dynamic, LC and ALC* on a cluster with four back-end servers, for a cache hit ratio of 99% (note the logarithmic scale on the vertical axis).
Fig. 10. Successive average slowdowns for Dynamic, LC and ALC* on a cluster with four back-end servers, when 0.02% of requests are dynamic and the cache hit ratio is 99% (note the logarithmic scale on the vertical axis).
Fig. 11. Successive average slowdowns for Dynamic, LC and ALC* on a cluster of eight back-end servers, when 10% of requests are dynamic and the cache hit ratio is 99%.
Fig. 12. Successive average slowdowns for LC, ALC* and LC* on a cluster of six back-end servers, when 10% of requests are dynamic and the cache hit ratio is 99%.
Fig. 13. Empirical time series of the synthetic trace: (a) number of request arrivals per second; (b) total bytes requested per second.
Fig. 14. Successive average slowdowns of Dynamic, LC and ALC* in a cluster of eight back-end servers (note the logarithmic scale on the vertical axis).
Table 1. Comparison of statistics for Dynamic and LC

Table 2. Comparative statistics for LC and LC*

A preliminary version of this paper appeared in the Proceedings of the IEEE International Symposium on Network Computing and Applications (IEEE NCA04).

Corresponding author. Tel.: +1 732 445 3128; fax: +1 732 445 6329.