ABSTRACT
Stream processing is mainstream (again): widely used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking. For instance, the popular, well-optimized Java 8 streams do not support the zip operator and are still an order of magnitude slower than hand-written loops.
We present the first approach that represents the full generality of stream processing and eliminates overheads, via the use of staging. It is based on an unusually rich semantic model of stream interaction. We support any combination of zipping, nesting (or flat-mapping), sub-ranging, filtering, and mapping, of finite or infinite streams. Our model captures idiosyncrasies that a programmer uses in optimizing stream pipelines, such as rate differences and the choice of a “for” vs. a “while” loop. Our approach delivers code that looks hand-written, but produces it automatically. It explicitly avoids reliance on black-box optimizers and sufficiently-smart compilers, offering the highest, guaranteed, and portable performance.
Our approach relies on high-level concepts that are then readily mapped into an implementation. Accordingly, we have two distinct implementations: an OCaml stream library, staged via MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we derive libraries that are richer than past work and simultaneously many tens of times faster. They greatly outperform the standard stream libraries available in Java, Scala and OCaml, including the well-optimized Java 8 streams.
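To make the contrast between a stream pipeline and a hand-written loop concrete, here is a minimal Java sketch. It is not the paper's library or its generated code: the class `FusionSketch` and its method names are hypothetical, chosen only for illustration. The first pair of methods shows a filter/map/sum pipeline next to the single "for" loop a programmer would write by hand. The last method hand-codes a zip of a filtered stream with a mapped stream, where the two inputs advance at different rates and a "while" loop is the natural shape.

```java
import java.util.stream.IntStream;

final class FusionSketch {
    // Declarative pipeline using the standard Java 8 stream API.
    static long sumOfSquaresOfEvens(int[] xs) {
        return IntStream.of(xs)
                .filter(x -> x % 2 == 0)
                .mapToLong(x -> (long) x * x)
                .sum();
    }

    // The same computation written by hand as one "for" loop:
    // no intermediate streams, closures, or boxing.
    static long sumOfSquaresOfEvensFused(int[] xs) {
        long acc = 0;
        for (int i = 0; i < xs.length; i++) {
            int x = xs[i];
            if (x % 2 == 0) {
                acc += (long) x * x;
            }
        }
        return acc;
    }

    // Hand-written equivalent of: zip(filter(even, xs), map(square, ys)),
    // then sum of pairwise products. The filtered side may consume several
    // elements of xs per output, so the cursors move at different rates
    // and a "while" loop with two indices is used (illustrative only).
    static long zipFilteredFused(int[] xs, int[] ys) {
        long acc = 0;
        int i = 0; // cursor into xs, may skip rejected elements
        int j = 0; // cursor into ys, advances once per zipped pair
        while (i < xs.length && j < ys.length) {
            int x = xs[i++];
            if (x % 2 != 0) {
                continue; // filter rejects x; do not consume from ys
            }
            long y = (long) ys[j] * ys[j];
            j++;
            acc += x * y;
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] xs = {1, 2, 3, 4, 5, 6};
        int[] ys = {1, 2, 3};
        System.out.println(sumOfSquaresOfEvens(xs));
        System.out.println(sumOfSquaresOfEvensFused(xs));
        System.out.println(zipFilteredFused(xs, ys));
    }
}
```

The two sum-of-squares methods return the same value for any input. The abstract's claim is that code of the second and third kind can be produced automatically from a declarative description, rather than written by hand as it is here.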