A task migration mechanism for distributed many-core operating systems

The Journal of Supercomputing

Abstract

Spatial locality of task execution is becoming increasingly important on future hardware platforms as the number of cores steadily grows. The large number of cores requires an intelligent power manager, and the high chip and core density requires increased thermal awareness to avoid thermal hotspots on the chip. This paper presents a lightweight task migration mechanism designed explicitly for distributed operating systems running on many-core platforms. Because the distributed OS runs one scheduler on each core, tasks are migrated between OS kernels within the same shared-memory platform. The benefits of task migration, such as improved performance and energy efficiency, are achieved by relocating running tasks to the most appropriate cores while keeping the overhead of the migration itself sufficiently low. We investigate the overhead of migrating tasks on a distributed OS running on both a bus-based platform and a many-core NoC. With these measurements, we can predict the task migration overhead and pinpoint emerging bottlenecks. With the presented task migration mechanism, we intend to improve the dynamism of power and performance characteristics in distributed many-core operating systems.
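
The idea of one scheduler per core handing tasks over within shared memory can be illustrated with a minimal sketch. This is not the mechanism from the paper, whose details are not given in the abstract. The names `Task`, `CoreKernel`, and `migrate` are hypothetical, and the sketch assumes that a task's state stays in shared memory so that only a reference to its control block moves between kernels.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task control block; its state is assumed to live in shared memory."""
    name: str
    state: dict = field(default_factory=dict)


class CoreKernel:
    """One scheduler instance per core, each with its own private ready set."""

    def __init__(self, core_id: int) -> None:
        self.core_id = core_id
        self.ready = {}  # maps task name -> Task

    def add(self, task: Task) -> None:
        self.ready[task.name] = task

    def remove(self, name: str) -> Task:
        # Raises KeyError if the task is not scheduled on this core.
        return self.ready.pop(name)


def migrate(name: str, src: CoreKernel, dst: CoreKernel) -> Task:
    """Hand a task from one kernel to another.

    Assumption: both kernels address the same memory, so only the
    reference to the control block moves and the state is not copied.
    """
    task = src.remove(name)
    dst.add(task)
    return task


core0, core1 = CoreKernel(0), CoreKernel(1)
worker = Task("worker", {"counter": 41})
core0.add(worker)
moved = migrate("worker", core0, core1)
```

In this simplified model the cost of a migration is two bookkeeping operations on the per-core ready sets. On real hardware, further costs such as cache refills and interconnect traffic would be expected, and this sketch does not model them.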

Acknowledgments

This work has been supported by the Artemis JU project RECOMP: Reduced Certification Costs Using Trusted Multi-core Platforms (Grant Agreement Number 100202). The present work benefited from the input of William Davy, Wittenstein ltd., who provided valuable software and integration efforts to the research summarized here.

Author information

Corresponding author

Correspondence to Simon Holmbacka.

About this article

Cite this article

Holmbacka, S., Fattah, M., Lund, W. et al. A task migration mechanism for distributed many-core operating systems. J Supercomput 68, 1141–1162 (2014). https://doi.org/10.1007/s11227-014-1144-7

  • DOI: https://doi.org/10.1007/s11227-014-1144-7
