ABSTRACT
As system sizes increase to exascale and beyond, system software must be enhanced to meet the needs and challenges of applications. The evolutionary versus revolutionary debate can be set aside by providing system software that simultaneously supports existing and new programming models. The seemingly contradictory requirements of scalable performance and traditional rich programming APIs (POSIX, and Linux in particular) suggest that approach and have led to a new class of research. Traditionally, operating systems for extreme-scale computing have followed one of two approaches: they either started with a full-weight kernel (FWK), typically Linux, and removed features that impeded performance and scalability, or they started with a light-weight kernel (LWK) and added capabilities to provide Linux compatibility. Neither approach succeeds in both retaining full Linux compatibility and achieving high scalability.
To overcome this problem, we have been exploring the design space of providing LWK performance while retaining the Linux APIs and the Linux environment. Our hybrid solution is to run Linux and an LWK side by side on the same node. HPC applications execute on top of the LWK, but the system selectively provides OS features by leveraging the Linux kernel. In this paper, we discuss two possible methods of achieving the symbiosis between the two kernels and the trade-offs between them. Specifically, we detail and contrast two particular approaches: Intel's mOS project and IHK/McKernel, an effort led by the RIKEN Advanced Institute for Computational Science.
Index Terms
- Exploring the Design Space of Combining Linux with Lightweight Kernels for Extreme Scale Computing