DOI: 10.1145/2694344.2694359
research-article
Open Access

Asynchronized Concurrency: The Secret to Scaling Concurrent Search Data Structures

Published: 14 March 2015

ABSTRACT

We introduce "asynchronized concurrency (ASCY)," a paradigm consisting of four complementary programming patterns. ASCY calls for the design of concurrent search data structures (CSDSs) to resemble that of their sequential counterparts. We argue that ASCY leads to implementations which are portably scalable: they scale across different types of hardware platforms, including single and multi-socket ones, for various classes of workloads, such as read-only and read-write, and according to different performance metrics, including throughput, latency, and energy. We substantiate our thesis through the most exhaustive evaluation of CSDSs to date, involving 6 platforms, 22 state-of-the-art CSDS algorithms, 10 re-engineered state-of-the-art CSDS algorithms following the ASCY patterns, and 2 new CSDS algorithms designed with ASCY in mind. We observe up to 30% improvements in throughput in the re-engineered algorithms, while our new algorithms out-perform the state-of-the-art alternatives.


Published in

ASPLOS '15: Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2015
720 pages
ISBN: 9781450328357
DOI: 10.1145/2694344

        Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ASPLOS '15 paper acceptance rate: 48 of 287 submissions, 17%. Overall acceptance rate: 535 of 2,713 submissions, 20%.
