DOI: 10.1145/3642921.3642930
research-article
Open Access

Using Source-to-Source to Target RISC-V Custom Extensions: UVE Case-Study

Published: 06 March 2024

ABSTRACT

Hardware specialization is seen as a promising avenue for improving computing efficiency, with reconfigurable devices as excellent deployment platforms for application-specific architectures. One approach to hardware specialization is via the popular RISC-V architecture, where Instruction Set Architecture (ISA) extensions for domains such as Edge Artificial Intelligence (AI) are already appearing. However, using the custom instructions while maintaining a high abstraction level (e.g., C/C++) requires modifying the assembler and compiler. Alternatively, a software developer with expert knowledge of the hardware modifications in the RISC-V core can introduce inline assembly manually.

In this paper, we consider a RISC-V core with a vectorization and streaming engine to support the Unlimited Vector Extension (UVE), and propose an approach to automatically transform annotated C loops into UVE-compatible code via automatic insertion of inline assembly. We rely on a source-to-source transformation tool, Clava, to perform sophisticated code analysis and transformations via scripts. We use pragmas to identify code sections amenable to vectorization and/or streaming, and use Clava to automatically insert inline UVE instructions, avoiding extensive modifications of existing compiler projects.
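
As a rough illustration of the kind of input such a flow might accept, the sketch below marks a simple C loop with a pragma. The pragma name `uve_example` and its clauses are hypothetical placeholders, not the annotation syntax defined by the paper or by Clava, and no UVE instructions are shown because the abstract does not list them. A standard C compiler ignores the unknown pragma, so the loop still runs as ordinary scalar code.

```c
#include <stddef.h>

/*
 * Hypothetical annotation: the pragma name and its clauses below are
 * assumptions made for illustration only. In the flow described above,
 * a source-to-source tool would locate a marked loop like this one and
 * replace it with inline assembly for the custom extension.
 */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* Marks the loop as a candidate for vectorization and streaming. */
#pragma uve_example vectorize stream(x, y)
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Keeping the annotation in a pragma means the unmodified source remains portable: the same file builds with a stock toolchain, and only the transformed output depends on the custom extension.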

We produce UVE binaries that are functionally correct when compared to handwritten versions with inline assembly, and that achieve an equal, and sometimes improved, number of executed instructions for a set of six benchmarks from the Polybench suite. These initial results are evidence that this kind of translation is feasible, and we consider that future work can target more complex transformations or other ISA extensions, accelerating the adoption of hardware/software co-design flows for generic application cases.

      • Published in

        RAPIDO '24: Proceedings of the 16th Workshop on Rapid Simulation and Performance Evaluation for Design
        January 2024
        54 pages
        ISBN:9798400717918
        DOI:10.1145/3642921

        Copyright © 2024 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 14 of 28 submissions, 50%