Abstract
As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low-level programming models (e.g., OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomposition, and machine-specific details, and most programmers have a difficult time using them effectively. To provide a programming model that addresses the productivity and performance requirements of the average programmer, we explore a domain-specific approach to heterogeneous parallel programming.
We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain-specific languages that are embedded in a common host language. We define criteria for language virtualization and present techniques to achieve them. We present two concrete case studies of domain-specific languages that are implemented using our virtualization approach.
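To make the idea of an embedded, virtualized DSL concrete, the following is a minimal sketch in Scala (the paper's host language). It is our own illustrative invention, not the paper's actual API: DSL operations build an intermediate representation instead of executing eagerly, so that a later compilation pass could, in principle, optimize, parallelize, or retarget the computation for heterogeneous hardware.

```scala
import scala.language.implicitConversions

// IR nodes: DSL expressions are reified as a tree rather than evaluated eagerly.
sealed trait Exp
case class Const(v: Double) extends Exp
case class Plus(a: Exp, b: Exp) extends Exp
case class Times(a: Exp, b: Exp) extends Exp

object DSL {
  // Virtualize the host language's operators: + and * on Exp build IR nodes.
  implicit class Ops(private val a: Exp) extends AnyVal {
    def +(b: Exp): Exp = Plus(a, b)
    def *(b: Exp): Exp = Times(a, b)
  }
  // Lift host-language literals into the DSL's representation.
  implicit def lift(v: Double): Exp = Const(v)

  // A trivial backend that interprets the IR. A real virtualized DSL would
  // instead analyze the IR and generate optimized (e.g., parallel) code here.
  def eval(e: Exp): Double = e match {
    case Const(v)    => v
    case Plus(a, b)  => eval(a) + eval(b)
    case Times(a, b) => eval(a) * eval(b)
  }
}
```

DSL programs then look like ordinary host-language arithmetic, e.g. `DSL.eval(DSL.lift(2.0) * 3.0 + 4.0)`, while the implementation retains full control over how (and where) the computation runs.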
Language virtualization for heterogeneous parallel computing. In OOPSLA '10: Proceedings of the ACM International Conference on Object-Oriented Programming Systems, Languages, and Applications.