Abstract
A network of workstations (NOW) is a cost-effective alternative to massively parallel supercomputers. As commercially available off-the-shelf processors become cheaper and faster, it is now possible to build a cluster that provides high computing power within a limited budget. However, a cluster may consist of different types of processors, and this heterogeneity complicates the design of efficient collective communication protocols. For example, finding an optimal reduction schedule for such heterogeneous clusters is a very hard combinatorial problem. Nevertheless, we show that a simple technique called slowest-node-first (SNF) is very effective in designing efficient reduction protocols for heterogeneous clusters. First, we show that SNF is a 2-approximation algorithm: the length of an SNF schedule is always within twice the optimal schedule length, no matter what kind of cluster is given. In addition, we show that SNF gives the optimal reduction time when the cluster consists of two types of processors and the ratio of communication speed between them is at least two. When the communication speed ratio is less than two, we develop a dynamic programming technique to find the optimal schedule. Our dynamic programming exploits the monotone property of the objective function, which significantly reduces the amount of computation time. Finally, combining this result with an approximation algorithm for broadcast (2004), we propose an all-reduction algorithm, which sends the reduction answer to all processors, with approximation ratio 3.5.
We conduct three groups of experiments. First, we show that SNF performs better than the built-in MPI_Reduce in a test cluster. Second, we observe a 93-fold saving in the computation time needed to find the optimal schedule, compared with a naive dynamic programming implementation. Third, we apply the theoretical results to a branch-and-bound search and show that they reduce the search time for the optimal reduction schedule by a factor of 500 when the cluster has three kinds of processors.
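The abstract does not spell out the communication model behind SNF, so the following is only a minimal sketch under assumptions that are not taken from the paper: each node i has a single cost `cost[i]`, the time it needs to receive and combine one incoming message; a node handles one transfer at a time; and the root is the fastest node. Under these assumptions, a reduction schedule can be obtained by time-reversing a greedy fastest-node-first broadcast, so that the slowest nodes send first. The function name `snf_reduction_schedule` and the tuple layout are hypothetical.

```python
import heapq


def snf_reduction_schedule(cost):
    """Sketch of a slowest-node-first reduction schedule (assumed model).

    cost[i] is the assumed time node i needs to receive and combine one
    message. Returns (schedule, makespan), where schedule is a list of
    (start, sender, receiver, end) tuples sorted by start time.
    """
    n = len(cost)
    order = sorted(range(n), key=lambda i: cost[i])  # fastest node first
    root = order[0]  # assumption: the fastest node is the reduction root

    # Step 1: greedy fastest-node-first broadcast from the root.
    # Heap entries are (time at which the node's next send completes, node).
    heap = [(cost[root], root)]
    events = []  # (completion time, broadcast sender, broadcast receiver)
    for v in order[1:]:
        t, u = heapq.heappop(heap)           # sender that finishes earliest
        events.append((t, u, v))
        heapq.heappush(heap, (t + cost[u], u))  # u can send again
        heapq.heappush(heap, (t + cost[v], v))  # v can start sending
    makespan = max((t for t, _, _ in events), default=0)

    # Step 2: reverse time. A broadcast transfer u -> v occupying
    # [t - cost[u], t] becomes a reduction transfer v -> u occupying
    # [makespan - t, makespan - t + cost[u]].
    schedule = sorted(
        (makespan - t, v, u, makespan - t + cost[u]) for t, u, v in events
    )
    return schedule, makespan


if __name__ == "__main__":
    sched, total = snf_reduction_schedule([1, 1, 2, 3])
    for start, sender, receiver, end in sched:
        print(f"t={start}..{end}: node {sender} -> node {receiver}")
    print("reduction time:", total)
```

In the four-node example, the two slowest nodes (costs 2 and 3) send in the first round and the remaining fast node forwards its combined result to the root afterwards. The sketch illustrates only the ordering idea; it does not reproduce the paper's model, its 2-approximation proof, or its dynamic programming.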
References
Anderson, T., Culler, D., Patterson, D.: A case for networks of workstations (NOW). In: IEEE Micro, February 1995, pp. 54–64 (1995)
Banikazemi, M., Moorthy, V., Panda, D.K.: Efficient collective communication on heterogeneous networks of workstations. In: Proceedings of International Parallel Processing Conference, pp. 460–467 (1998)
Banino, C., Beaumont, O., Carter, L., Ferrante, J., Legrand, A., Robert, Y.: Scheduling strategies for master–slave tasking on heterogeneous processor platforms. IEEE Trans. Parallel Distrib. Syst. 15(4), 319–330 (2004)
Bar-Noy, A., Guha, S., Naor, J., Schieber, B.: Multicast in heterogeneous networks. In: Proceedings of the 13th Annual ACM Symposium on Theory of Computing (1998)
Bar-Noy, A., Kipnis, S.: Designing broadcast algorithms in the postal model for message-passing systems. Math. Syst. Theory 27(5), 431–452 (1994)
Beaumont, O., Legrand, A., Marchal, L., Robert, Y.: Pipelining broadcasts on heterogeneous platforms. IEEE Trans. Parallel Distrib. Syst. 16(4), 300–313 (2005)
Beaumont, O., Marchal, L., Robert, Y.: Broadcast trees for heterogeneous platforms. In: 19th IEEE International Parallel and Distributed Processing Symposium, vol. 1, p. 80b. IEEE Computer Society, Los Alamitos (2005)
Bhat, P.B., Raghavendra, C.S., Prasanna, V.K.: Efficient collective communication in distributed heterogeneous systems. In: Proceedings of the International Conference on Distributed Computing Systems (1999)
Cui, A.Q., Street, R.L.: Large-eddy simulation of coastal upwelling flow. Environ. Fluid Mech. 4(2), 197–223 (2004)
den Burger, M., Kielmann, T., Bal, H.E.: Balanced multicasting: high-throughput communication for grid applications. In: Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, p. 46. IEEE Computer Society, Washington (2005)
Dinneen, M., Fellows, M., Faber, V.: Algebraic construction of efficient networks. In: Applied Algebra, Algebraic Algorithms, and Error Correcting Codes. Lecture Notes in Computer Science, vol. 539, p. 9. Springer, Berlin (1991)
Dubinski, J., Kim, J., Park, C., Humble, R.: GOTPM: a parallel hybrid particle-mesh treecode. New Astronomy 9(2), 111–126 (2004)
Bruck, J. et al.: Efficient message passing interface (MPI) for parallel computing on clusters of workstations. J. Parallel Distrib. Comput. 40(1), 19–34 (1997)
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979)
Gargano, L., Vaccaro, U.: On the construction of minimal broadcast networks. Networks 19(6), 673–689 (1989)
Grigni, M., Peleg, D.: Tight bounds on minimum broadcast networks. SIAM J. Discrete Math. 4, 207–222 (1991)
Gropp, W., Lusk, E., Doss, N., Skjellum, A.: A high-performance, portable implementation of the MPI message passing interface standard. Parallel Comput. 22(6), 789–828 (1996)
Hedetniemi, S.M., Hedetniemi, S.T., Liestman, A.L.: A survey of gossiping and broadcasting in communication networks. Networks 18(4), 319–349 (1988)
Karonis, N., de Supinski, B., Foster, I., Gropp, W., Lusk, E., Bresnahan, J.: Exploiting hierarchy in parallel computer networks to optimize collective operation performance. In: Proceedings of the 14th International Parallel and Distributed Processing Symposium (2000)
Karp, R., Sahay, A., Santos, E., Schauser, K.E.: Optimal broadcast and summation in the LogP model. In: Proceedings of 5th Annual Symposium on Parallel Algorithms and Architectures (1993)
Kesavan, R., Bondalapati, K., Panda, D.: Multicast on irregular switch-based networks with wormhole routing. In: Proceedings of International Symposium on High Performance Computer Architecture (1997)
Khuller, S., Kim, Y.: On broadcasting in heterogeneous networks. In: Proceedings of the 16th Annual ACM Symposium on Parallel Architectures and Algorithms (2004)
Kielmann, T., Hofman, R.F.H., Bal, H.E., Plaat, A., Bhoedjang, R.A.F.: MPI’s reduction operations in clustered wide area systems. In: Proceedings of the Message Passing Interface Developer’s and User’s Conference (1999)
Liestman, A.L., Peters, J.G.: Broadcast networks of bounded degree. SIAM J. Discrete Math. 1, 531–540 (1988)
Liu, P.: Broadcast scheduling optimization for heterogeneous cluster systems. J. Algorithms 42, 135–152 (2002)
Liu, P., Wang, D., Guo, Y.: An approximation algorithm for broadcast scheduling in heterogeneous cluster. In: The 9th International Conference on Realtime Computing Systems and Applications, Taiwan (2003)
Luecke, G.R., Kraeva, M., Yuan, J., Spanoyannis, S.: Performance and scalability of MPI on PC clusters. Concurr. Comput. Pract. Exp. 16(1), 79–107 (2004)
Mpich, O.: Improving the performance of collective operations in MPICH. In: Proceedings of the 11th EuroPVM/MPI Conference (2003)
Rabenseifner, R.: Optimization of collective reduction operations. In: Proceedings of International Conference on Computational Science (2004)
Rabenseifner, R., Träff, J.L.: More efficient reduction algorithms for non-power-of-two number of processors in message-passing parallel systems. In: PVM/MPI, pp. 36–46 (2004)
Richards, D., Liestman, A.L.: Generalization of broadcast and gossiping. Networks 18(2), 125–138 (1988)
Steffenel, L.: A framework for adaptive collective communications on heterogeneous hierarchical networks. Research Report 6036, INRIA (2006)
Vadhiyar, S.S., Fagg, G.E., Dongarra, J.: Automatically tuned collective communications. In: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, p. 46. IEEE Computer Society, Los Alamitos (2000)
Ventura, J.A., Weng, X.: A new method for constructing minimal broadcast networks. Networks 23(5), 481–497 (1993)
West, D.B.: A class of solutions to the gossip problem. Discrete Math. 39(33), 307–326 (1992)
Yin, Z., Clercx, H.J.H., Montgomery, D.C.: An easily implemented task-based parallel scheme for the Fourier pseudospectral solver applied to 2D Navier–Stokes turbulence. Comput. Fluids 33(4), 509–520 (2004)
Cite this article
Liu, P., Kuo, MC. & Wang, DW. An Approximation Algorithm and Dynamic Programming for Reduction in Heterogeneous Environments. Algorithmica 53, 425–453 (2009). https://doi.org/10.1007/s00453-007-9113-7
DOI: https://doi.org/10.1007/s00453-007-9113-7