DOI: 10.1145/3136000.3136002 (ACM SPLASH conference proceedings)
research-article

Making collection operations optimal with aggressive JIT compilation

Published: 22 October 2017

ABSTRACT

Functional collection combinators are a neat and widely accepted data processing abstraction. However, their generic nature results in high abstraction overheads -- Scala collections are known to be notoriously slow for typical tasks. We show that proper optimizations in a JIT compiler can widely eliminate overheads imposed by these abstractions. Using the open-source Graal JIT compiler, we achieve speedups of up to 20x on collection workloads compared to the standard HotSpot C2 compiler. Consequently, a sufficiently aggressive JIT compiler allows the language compiler, such as Scalac, to focus on other concerns.

In this paper, we show how optimizations, such as inlining, polymorphic inlining, and partial escape analysis, are combined in Graal to produce collections code that is optimal with respect to manually written code, or close to optimal. We argue why some of these optimizations are more effectively done by a JIT compiler. We then identify specific use-cases that most current JIT compilers do not optimize well, warranting special treatment from the language compiler.
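
To make the comparison with "manually written code" concrete, here is a minimal sketch, not taken from the paper, of one computation written twice: once with collection combinators and once as a hand-written loop. The object and method names (`CollectionSumSketch`, `sumOfSquaresCombinators`, `sumOfSquaresLoop`) are hypothetical and chosen only for illustration. The sketch makes no performance claim. It only shows the kind of abstraction (closures, an intermediate collection) that a JIT compiler would have to inline and eliminate for the first version to approach the second.

```scala
object CollectionSumSketch {

  // Combinator style: `map` builds an intermediate Array[Long], and each
  // element is passed through a closure. On a generic code path the values
  // may also be boxed. These are the abstraction overheads the abstract
  // refers to.
  def sumOfSquaresCombinators(xs: Array[Int]): Long =
    xs.map(x => x.toLong * x).foldLeft(0L)(_ + _)

  // Hand-written baseline: a single loop with no closures and no
  // intermediate collection.
  def sumOfSquaresLoop(xs: Array[Int]): Long = {
    var acc = 0L
    var i = 0
    while (i < xs.length) {
      val x = xs(i).toLong
      acc += x * x
      i += 1
    }
    acc
  }

  def main(args: Array[String]): Unit = {
    val data = Array.tabulate(1000)(i => i + 1)
    // Both versions compute the same value; only their cost may differ.
    println(sumOfSquaresCombinators(data) == sumOfSquaresLoop(data))
  }
}
```

Whether a particular JIT compiler actually reduces the first method to something like the second depends on its inlining and escape-analysis decisions at run time, so any comparison should be measured rather than assumed.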

