Abstract
Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that uses content similarity to improve I/O performance by eliminating I/O operations and reducing the mechanical delays incurred during them. I/O Deduplication consists of three main techniques: content-based caching, dynamic replica retrieval, and selective duplication. Each technique is motivated by our observations of I/O workload traces obtained from actively used production storage systems, all of which revealed surprisingly high levels of content similarity for both stored and accessed data. Evaluation of a prototype implementation showed an overall improvement in disk I/O performance of 28% to 47% across these workloads. A further breakdown showed that each of the three techniques contributed significantly to the overall improvement.
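The abstract names content-based caching without describing its mechanics. One plausible reading is a cache indexed by a hash of block content rather than by block address, so that duplicate blocks at different addresses share a single cached copy. The sketch below illustrates that idea only; the class `ContentCache`, its methods, the choice of SHA-256, and the LRU policy are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib
from collections import OrderedDict


class ContentCache:
    """Illustrative LRU cache keyed by content hash (hypothetical, not the paper's design)."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of distinct contents held
        self.addr_to_hash = {}        # block address -> content digest
        self.blocks = OrderedDict()   # content digest -> data, kept in LRU order

    def insert(self, addr, data):
        # Assumption: SHA-256 stands in for whatever content hash a real system uses.
        digest = hashlib.sha256(data).hexdigest()
        self.addr_to_hash[addr] = digest
        self.blocks[digest] = data
        self.blocks.move_to_end(digest)          # mark as most recently used
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)      # evict least recently used content

    def read(self, addr):
        digest = self.addr_to_hash.get(addr)
        if digest is None or digest not in self.blocks:
            return None                          # miss: the caller would go to disk
        self.blocks.move_to_end(digest)
        return self.blocks[digest]


cache = ContentCache(capacity=2)
cache.insert(10, b"same bytes")
cache.insert(99, b"same bytes")    # duplicate content at a different address
cache.insert(20, b"other bytes")   # fits: only two distinct contents are stored
```

Because the two addresses with identical content occupy one cache slot, the third insert does not force an eviction. This sketch never prunes stale entries from `addr_to_hash`; a real design would need to bound that map as well.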
Index Terms
- I/O Deduplication: Utilizing content similarity to improve I/O performance