Dynamo: Amazon's highly available key-value store

Published: 14 October 2007
DOI: 10.1145/1294261.1294281

ABSTRACT

Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The Amazon.com platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems.

This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
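The abstract does not say how versions are tracked or how conflicts are resolved, so the following is only a minimal sketch of what "object versioning and application-assisted conflict resolution" could look like. It assumes each stored version carries a version vector (a per-node counter map) and that the application supplies its own merge function, here a set union over shopping-cart items. The names `descends`, `prune`, `reconcile`, and `union_merge` are hypothetical and are not taken from the paper's interface.

```python
from typing import Callable, Dict, List, Set, Tuple

Clock = Dict[str, int]            # node id -> counter (assumed version vector)
Version = Tuple[Clock, Set[str]]  # (clock, value); value is a set of cart items


def descends(a: Clock, b: Clock) -> bool:
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def prune(versions: List[Version]) -> List[Version]:
    """Drop versions that are strictly superseded by another version."""
    kept: List[Version] = []
    for clock, value in versions:
        superseded = any(
            descends(other, clock) and other != clock for other, _ in versions
        )
        if not superseded and (clock, value) not in kept:
            kept.append((clock, value))
    return kept


def reconcile(
    versions: List[Version],
    merge: Callable[[List[Set[str]]], Set[str]],
    node: str,
) -> Version:
    """Collapse concurrent versions using an application-supplied merge."""
    siblings = prune(versions)
    if len(siblings) == 1:
        return siblings[0]  # no conflict: one version dominates the rest
    merged_clock: Clock = {}
    for clock, _ in siblings:
        for n, count in clock.items():
            merged_clock[n] = max(merged_clock.get(n, 0), count)
    # The reconciling node records the merge as a new update of its own.
    merged_clock[node] = merged_clock.get(node, 0) + 1
    return merged_clock, merge([value for _, value in siblings])


def union_merge(values: List[Set[str]]) -> Set[str]:
    """Hypothetical application policy: keep every item seen in any version."""
    merged: Set[str] = set()
    for value in values:
        merged |= value
    return merged


base: Version = ({"a": 1}, {"book"})
left: Version = ({"a": 2}, {"book", "pen"})             # update via node a
right: Version = ({"a": 1, "b": 1}, {"book", "lamp"})   # concurrent update via node b
result = reconcile([base, left, right], union_merge, node="a")
```

In this sketch `base` is discarded because `left` supersedes it, while `left` and `right` are concurrent and are handed to the application's merge. A union merge never loses an added item, but it can resurrect a deleted one, which is the kind of trade-off an application-level policy has to weigh.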



Published in

SOSP '07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
October 2007
378 pages
ISBN: 9781595935915
DOI: 10.1145/1294261

ACM SIGOPS Operating Systems Review, Volume 41, Issue 6
SOSP '07
December 2007
363 pages
ISSN: 0163-5980
DOI: 10.1145/1323293

Copyright © 2007 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall acceptance rate: 131 of 716 submissions, 18%
