ABSTRACT
The tremendous growth of RAM capacity, which now exceeds multiple terabytes, necessitates a reevaluation of traditional memory-management methods that were developed when resources were scarce. Current virtual-memory subsystems handle address-space regions as sets of individual 4-KiB pages with demand paging and copy-on-write, resulting in significant management overhead. Although huge pages reduce the number of managed entities, they induce internal fragmentation and have a coarse copy granularity.
To address these problems, we introduce Morsels, a novel virtual-memory-management paradigm that is purely based on hardware data structures and enables the efficient sharing of virtual-memory objects between processes and devices while being well suited for non-volatile memory. Our benchmarks show that Morsels reduce the mapping time for a 6.82-GiB machine-learning model by up to 99.8 percent compared to conventional memory mapping in Linux.
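To give a sense of scale for the per-page management overhead, the following sketch counts how many entities a conventional subsystem would have to track for the 6.82-GiB model. The 4-KiB base-page size is taken from the abstract; the 2-MiB huge-page size is an assumption (the common x86-64 size) and is not stated here.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def pages_needed(size_bytes: int, page_size: int) -> int:
    """Return the number of pages needed to cover size_bytes, rounded up."""
    return -(-size_bytes // page_size)  # integer ceiling division


# Model size quoted in the abstract, truncated to whole bytes.
model_bytes = int(6.82 * GIB)

# Individually managed 4-KiB base pages.
base_pages = pages_needed(model_bytes, 4 * KIB)

# Assumption: 2-MiB huge pages, which is not a figure from the abstract.
huge_pages = pages_needed(model_bytes, 2 * MIB)

print(f"4-KiB pages: {base_pages}")
print(f"2-MiB pages: {huge_pages}")
```

Under these assumptions the mapping spans roughly 1.79 million base pages but only about 3,500 huge pages, which illustrates why the number of managed entities matters for mapping time.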