Abstract
Field programmable gate arrays (FPGAs) are fundamentally different to fixed processors architectures because their memory hierarchies can be tailored to the needs of an algorithm. FPGA compilers for high level languages are not hindered by fixed memory hierarchies. The constraint when compiling to FPGAs is the availability of resources.
In this paper we describe how the dataflow intermediary of our declarative FPGA image processing DSL called RIPL (Rathlin Image Processing Language) enables us to constrain memory. We use five benchmarks to demonstrate that memory use with RIPL is comparable to the Vivado HLS OpenCV library without the need for language pragmas to guide hardware synthesis. The benchmarks also show that RIPL is more expressive than the Darkroom FPGA image processing language.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
References
Bezati, E.: High-level synthesis of dataflow programs for heterogeneous platforms. Ph.D. thesis, STI, EPFL, Switzerland (2015)
Bradski, G.R., Kaehler, A.: Learning OpenCV - Computer Vision with the OpenCV library: Software that Sees. O’Reilly, Beijing (2008)
Cole, M.: Algorithmic Skeletons: Structured Management of Parallel Computation. MIT Press, Cambridge (1991)
DeVito, Z., Hegarty, J., Aiken, A., Hanrahan, P., Vitek, J.: Terra: a multi-stage language for high-performance computing. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, Seattle, WA, USA, June 16–19, 2013, pp. 105–116. ACM (2013)
Hegarty, J., Brunhaver, J., DeVito, Z., Ragan-Kelley, J., Cohen, N., Bell, S., Vasilyev, A., Horowitz, M., Hanrahan, P.: Darkroom: compiling high-level image processing code into hardware pipelines. ACM Trans. Graph. 33(4), 1–11 (2014)
Kennedy, K., McKinley, K.S.: Maximizing loop parallelism and improving data locality via loop fusion and distribution. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds.) LCPC 1993. LNCS, vol. 768, pp. 301–320. Springer, Heidelberg (1994). doi:10.1007/3-540-57659-2_18
Kiselyov, O.: Iteratee IO: Safe, Practical, Declarative Input Processing. In: 11th International Symposium on Functional and Logic Programming. LNCS, vol. 7294, pp. 166–181 (2012)
Lee, H., Brown, K.J., Sujeeth, A.K., Rompf, T., Olukotun, K.: locality-aware mapping of nested parallel patterns on GPUs. In: 47th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2014, Cambridge, UK, December 13–17, 2014, pp. 63–74. IEEE (2014)
Muddukrishna, A., Jonsson, P.A., Brorsson, M.: Locality-aware task scheduling and data distribution for openmp programs on NUMA systems and manycore processors. Sci. Program. 2015, 981759: 1–981759: 16 (2015)
Stephen Neuendorffer, T.L., Wang, D.: Accelerating OpenCV Applications with Zynq-7000 All Programmable SoC using Vivado HLS Video Libraries. Technical report, Xilinx, June 2015
Tate, A., et al.: Programming abstractions for data locality. In: Workshop on Programming Abstractions for Data Locality, Swiss National Supercomputing Center, Lugano, Switzerland, April 2014
Wieser, V., Grelck, C., Haslinger, P., Guo, J., Korzeniowski, F., Bernecky, R., Moser, B., Scholz, S.: Combining high productivity and high performance in image processing using Single Assignment C on multi-core CPUs and many-core GPUs. J. Electron. Imaging 21(2), 21116 (2012)
Xilinx: Implementing Memory Structures for Video Processing in the Vivado HLS Tool. Technical report, Xilinx, September 2012
Acknowledgements
We acknowledge the support of the Engineering and Physical Research Council, grant reference EP/K009931/1 (Programmable embedded platforms for remote and compute intensive image processing applications).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Stewart, R., Michaelson, G., Bhowmik, D., Garcia, P., Wallace, A. (2016). A Dataflow IR for Memory Efficient RIPL Compilation to FPGAs. In: Carretero, J., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2016. Lecture Notes in Computer Science(), vol 10049. Springer, Cham. https://doi.org/10.1007/978-3-319-49956-7_14
Download citation
DOI: https://doi.org/10.1007/978-3-319-49956-7_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49955-0
Online ISBN: 978-3-319-49956-7
eBook Packages: Computer ScienceComputer Science (R0)