A Job Scheduling Approach for Multi-core Clusters Based on Virtual Malleability

Utrera, Gladys; Tabik, Siham; Corbalan, Julita; Labarta, Jesús

doi:10.1007/978-3-642-32820-6_20

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7484)

Abstract

Many commercial job scheduling strategies for multiprocessing systems aim to minimize the waiting times of short jobs. However, long jobs cannot be neglected, because they also have a decisive impact on system performance. In this work we propose a job scheduling strategy that maximizes resource utilization and improves overall performance by allowing jobs to adapt to variations in the load. The experimental evaluation includes both simulations and executions of real workloads. The results show that our strategy provides significant improvements over the traditional EASY backfilling policy, especially at medium to high machine loads.

Author information

Authors and Affiliations

Technical University of Catalonia (UPC), 08034, Barcelona, Spain
Gladys Utrera & Julita Corbalan
University of Malaga, 29071, Malaga, Spain
Siham Tabik
Barcelona Supercomputing Center (BSC), 08034, Barcelona, Spain
Jesús Labarta

Editor information

Editors and Affiliations

University of Patras, Computer Technology Institute and Press “Diophantus”, N. Kazantzaki, 26504, Rio, Greece
Christos Kaklamanis
University of Patras, University Building B, 26504, Rio, Greece
Theodore Papatheodorou
Computer Technology Institute and Press “Diophantus”, University of Patras, N. Kazantzaki, 26504, Rio, Greece
Paul G. Spirakis

About this paper

Cite this paper

Utrera, G., Tabik, S., Corbalan, J., Labarta, J. (2012). A Job Scheduling Approach for Multi-core Clusters Based on Virtual Malleability. In: Kaklamanis, C., Papatheodorou, T., Spirakis, P.G. (eds) Euro-Par 2012 Parallel Processing. Euro-Par 2012. Lecture Notes in Computer Science, vol 7484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32820-6_20

DOI: https://doi.org/10.1007/978-3-642-32820-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32819-0
Online ISBN: 978-3-642-32820-6
eBook Packages: Computer Science, Computer Science (R0)
