Abstract
Spatial locality of task execution is becoming important in future hardware platforms since the number of cores is steadily increasing. The large number of cores requires an intelligent power manager, and the high chip and core density requires increased thermal awareness to avoid thermal hotspots on the chip. This paper presents a lightweight task migration mechanism designed explicitly for distributed operating systems running on many-core platforms. As the distributed OS runs one scheduler on each core, the tasks are migrated between OS kernels within the same shared memory platform. The benefits of task migration, such as performance and energy efficiency, are achieved by relocating running tasks to the most appropriate cores while keeping the overhead of executing such a migration sufficiently low. We investigate the overhead of migrating tasks on a distributed OS running both on a bus-based platform and on a many-core NoC. With these measurements, we can predict the task migration overhead and pinpoint the emerging bottlenecks. With the presented task migration mechanism, we intend to improve the dynamism of power and performance characteristics in distributed many-core operating systems.
Acknowledgments
This work has been supported by the Artemis JU project RECOMP: Reduced Certification Costs Using Trusted Multi-core Platforms (Grant Agreement Number 100202). The present work benefited from the input of William Davy, Wittenstein ltd., who provided valuable software and integration efforts to the research summarized here.
Cite this article
Holmbacka, S., Fattah, M., Lund, W. et al. A task migration mechanism for distributed many-core operating systems. J Supercomput 68, 1141–1162 (2014). https://doi.org/10.1007/s11227-014-1144-7
DOI: https://doi.org/10.1007/s11227-014-1144-7