DOI: 10.1145/3319647.3325829
research-article

FADaC: a self-adapting data classifier for flash memory

Published: 22 May 2019

ABSTRACT

Solid state drives (SSDs) implement a log-structured write pattern, where obsolete data remains stored on flash pages until the flash translation layer (FTL) erases them. An erase() operation, however, cannot target a single page; it always applies to an entire flash block. Since these victim blocks typically store a mix of valid and obsolete pages, FTLs have to copy the valid data to a new block before issuing an erase() operation. This copying increases the latencies of concurrent I/Os and reduces the lifetime of flash memory.
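The cost described above can be illustrated with a minimal sketch. It is not taken from the paper: the `Block` class and the `collect` helper are hypothetical names, and a block is modeled simply as a list of flags marking each page as valid or obsolete.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical model of a flash block: one flag per written page,
    True for valid data and False for obsolete data."""
    pages: list

    def valid_count(self) -> int:
        return sum(1 for is_valid in self.pages if is_valid)


def collect(victim: Block, target: Block) -> int:
    """Copy the victim's valid pages to the target block, then erase the
    whole victim block. Returns the number of extra page writes caused."""
    copied = victim.valid_count()
    target.pages.extend([True] * copied)  # valid data is rewritten elsewhere
    victim.pages.clear()                  # erase() only works on a full block
    return copied


victim = Block([True, False, True, False])
target = Block([])
extra_writes = collect(victim, target)
```

In this toy case, half of the victim's pages are still valid, so reclaiming the block costs two additional page writes that no host request asked for.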

Data classification schemes identify data pages with similar update frequencies and group them together. FTLs can use this grouping to design garbage collection strategies that find victim blocks containing less valid data than they would without data classification, and thereby significantly reduce the number of additional I/Os.
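A toy comparison may make the benefit of grouping concrete. The sketch below is an illustration under stated assumptions, not the classifier from the paper: page names, the two layouts, and the greedy choice of the block with the fewest valid pages are all hypothetical.

```python
def valid_after_updates(block, updated):
    """Count pages in `block` that are still valid once every page
    named in `updated` has been overwritten elsewhere."""
    return sum(1 for page in block if page not in updated)


def cheapest_victim_cost(blocks, updated):
    """Assumed greedy policy: pick the block with the fewest valid pages
    and return how many pages would have to be copied before erasing it."""
    return min(valid_after_updates(block, updated) for block in blocks)


# Four frequently updated ("hot") pages and four rarely updated ("cold") pages.
hot = {"h0", "h1", "h2", "h3"}

# Without classification, hot and cold pages share blocks.
mixed = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]
# With classification, pages of similar update frequency share a block.
grouped = [["h0", "h1", "h2", "h3"], ["c0", "c1", "c2", "c3"]]

mixed_cost = cheapest_victim_cost(mixed, hot)
grouped_cost = cheapest_victim_cost(grouped, hot)
```

After all hot pages are overwritten, every mixed block still holds two valid cold pages, whereas the grouped layout leaves one block entirely obsolete, so it can be erased without copying anything.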

Previous data classification algorithms have been designed without leveraging special features of flash memory and often rely on workload-specific configurations. Our classifier FADaC tunes its parameters online and operates on any given amount of memory by storing additional information within the metadata of flash pages. Additional read() requests for the classification are so few that FADaC reduces the internal flash overhead by up to 45% compared to the best classifier from previous work.

          Published in

          SYSTOR '19: Proceedings of the 12th ACM International Conference on Systems and Storage
          May 2019
          211 pages
          ISBN: 9781450367493
          DOI: 10.1145/3319647

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 94 of 285 submissions, 33%
