ABSTRACT
The emergence of cloud file sharing systems has been motivated by real user needs for data sharing. Many solutions provide such sharing support, all with the common goal of scaling widely while giving users consistent shared data. However, offering consistent data is at odds with scalability, as it requires many messages and substantial network bandwidth for file transfer.
Network bandwidth usage can be reduced with several techniques, such as compression, deduplication [10], and delta encoding [9]. However, these approaches do not take into account that not all files must be fully consistent at all times for all users.
In this paper we further increase the scalability of a cloud file sharing system, called vfcBOX, by taking into account the notion of user interest. That is, vfcBOX considers users' consistency needs regarding shared files in order to avoid sending unnecessary data through the network. Some files do not need to be constantly propagated to all users, because some users do not require such immediacy given the particular semantics of the shared data.
vfcBOX uses not only deduplication techniques to minimize network usage but also a consistency model that takes users' interests into account. The result is a scalable and efficient cloud file sharing system that fulfills users' needs regarding data sharing.
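As a rough illustration of how the two ideas in the abstract can combine, the sketch below defers propagation until a per-user staleness bound is exceeded and, when it does propagate, sends only chunks the user does not already hold. It is not the vfcBOX implementation: the `Subscriber` class, the `max_pending` bound, the fixed-size chunking, and the tiny `CHUNK_SIZE` are all assumptions made for this example.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 4  # assumption: tiny fixed-size chunks, for illustration only


def chunk_digests(data: bytes, size: int = CHUNK_SIZE) -> list[tuple[str, bytes]]:
    """Split data into fixed-size chunks and pair each with its SHA-1 digest."""
    return [
        (hashlib.sha1(data[i:i + size]).hexdigest(), data[i:i + size])
        for i in range(0, len(data), size)
    ]


class Subscriber:
    """Hypothetical user replica: known chunk digests plus a per-file staleness bound."""

    def __init__(self, max_pending: dict[str, int]):
        self.max_pending = max_pending      # file name -> updates tolerated before a push
        self.pending: dict[str, int] = {}   # file name -> updates withheld so far
        self.known: set[str] = set()        # digests of chunks this user already holds

    def on_update(self, name: str, data: bytes) -> int | None:
        """Return the bytes sent for this update, or None if propagation is deferred."""
        self.pending[name] = self.pending.get(name, 0) + 1
        if self.pending[name] <= self.max_pending.get(name, 0):
            return None  # still within this user's tolerated staleness
        self.pending[name] = 0
        sent = 0
        for digest, chunk in chunk_digests(data):
            if digest not in self.known:    # deduplication: skip chunks already held
                self.known.add(digest)
                sent += len(chunk)
        return sent
```

With these assumptions, a user who tolerates no staleness (`max_pending` of 0) receives every update but pays only for new chunks, while a user with a bound of 2 receives nothing until the third update to that file.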
REFERENCES
[1] M. Armbrust, A. Fox, R. Griffith, A. Joseph, and R. Katz. A view of cloud computing. Communications of the ACM, 53(4):50--58, 2010.
[2] S. Balasubramaniam and B. Pierce. What is a file synchronizer? In MobiCom '98: Proceedings of the 4th annual ACM/IEEE international conference on Mobile computing and networking, 1998.
[3] J. Barreto and P. Ferreira. A replicated file system for resource constrained mobile devices. In Proceedings of IADIS International Conference on Applied Computing, 2004.
[4] J. Barreto and P. Ferreira. Efficient locally trackable deduplication in replicated systems. In Middleware '09: Proceedings of the ACM/IFIP/USENIX 10th international conference on Middleware, 2009.
[5] L. Cox, C. Murray, and B. Noble. Pastiche: Making backup cheap and easy. In OSDI '02: Proceedings of the 5th symposium on Operating systems design and implementation, pages 285--298, 2002.
[6] D. E. Eastlake and P. E. Jones. US Secure Hash Algorithm 1 (SHA1). http://www.ietf.org/rfc/rfc3174.txt?number=3174, 2001.
[7] Collins-Sussman et al. Version control with Subversion. O'Reilly, 2004.
[8] K. Morse et al. Interest management in large-scale distributed simulations. Information and Computer Science, University of California, Irvine, 1996.
[9] J. J. Hunt, K.-P. Vo, and W. F. Tichy. An empirical study of delta algorithms. In ICSE '96: Proceedings of the SCM-6 Workshop on System Configuration Management, pages 49--66, 1996.
[10] N. Mandagere, P. Zhou, M. Smith, and S. Uttamchandani. Demystifying data deduplication. In Companion '08: Proceedings of the ACM/IFIP/USENIX Middleware '08 Conference Companion, 2008.
[11] A. Muthitacharoen, B. Chen, and D. Mazières. A low-bandwidth network file system. In SOSP '01: Proceedings of the eighteenth ACM symposium on Operating systems principles, Volume 35 Issue 4:174--187, 2001.
[12] M. Palankar, A. Iamnitchi, M. Ripeanu, and S. Garfinkel. Amazon S3 for science grids: a viable solution? In DADC '08: Proceedings of the 2008 international workshop on Data-aware distributed computing, 2008.
[13] M. Rabin. Fingerprinting by random polynomials. Technical Report TR-15-81, Center for Research in Computing Technology, Harvard University, 1981.
[14] Y. Saito and M. Shapiro. Optimistic replication. ACM Computing Surveys (CSUR), 37(1):42--81, 2005.
[15] A. Tridgell and P. Mackerras. The rsync algorithm. Australian National University, 1998.
[16] L. Veiga and P. Ferreira. Semantic-chunks: A middleware for ubiquitous cooperative work. In ARM '05: Proceedings of the 4th workshop on Reflective and Adaptive Middleware Systems, 2005.
Index Terms
- vfcBOX: multi-user consistent file sharing