ABSTRACT
Message-passing is an attractive thread coordination mechanism because it cleanly delineates the points in an execution at which threads communicate, and it unifies synchronization with communication: a sender may proceed only when a receiver willing to accept the data being sent is available, and vice versa. For greater performance, however, asynchronous or non-blocking extensions are usually provided that allow senders and receivers to proceed even when no matching partner is available. Lightweight threads with synchronous message-passing can be used to encapsulate asynchronous message-passing operations, but such encodings incur thread-management costs that can hurt scalability and performance.
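For concreteness, here is one common encoding of this idea (a sketch, not drawn from the paper) in Concurrent ML: a lightweight thread is spawned solely to perform the synchronous send, so the caller proceeds immediately, but every asynchronous send pays the cost of creating and scheduling a full thread.

```sml
(* Sketch: an asynchronous send built from standard CML primitives.
   The spawned lightweight thread blocks in CML.send until a receiver
   arrives; the caller returns immediately. *)
fun asyncSend (ch : 'a CML.chan, v : 'a) : unit =
  ignore (CML.spawn (fn () => CML.send (ch, v)))
```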
This paper introduces parasitic threads, a novel mechanism for expressing asynchronous computation that combines the efficiency of a non-declarative solution with the ease of use provided by languages with first-class channels and lightweight threads. A parasitic thread is a lightweight data structure that encapsulates an asynchronous computation using the resources of a host thread. Parasitic threads need not execute cooperatively, impose no restrictions on the computations they encapsulate or on the communication actions they perform, and place no additional burden on thread scheduling mechanisms.
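As a hypothetical illustration of the programming model (the primitive name and fallback definition below are assumed for exposition, not the paper's actual API), a parasitic spawn would let the same asynchronous send run on the host thread's stack, deferring thread-management costs until the computation actually blocks:

```sml
(* spawnParasite is a stand-in for a parasitic-thread primitive; this
   fallback definition (an ordinary CML spawn) is an assumption for
   illustration only. *)
fun spawnParasite (f : unit -> unit) : unit = ignore (CML.spawn f)

(* An asynchronous send expressed against the primitive: with parasitic
   threads, the body runs on the host's resources and incurs
   thread-management costs only if CML.send must block. *)
fun asyncSend (ch : 'a CML.chan, v : 'a) : unit =
  spawnParasite (fn () => CML.send (ch, v))
```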
We describe an implementation of parasitic threads in MLton, a whole-program optimizing compiler and runtime for Standard ML. Benchmark results indicate that parasitic threads enable the construction of scalable and efficient message-passing parallel programs.