research-article

A Hypervisor for Shared-Memory FPGA Platforms

Authors:
Jiacheng Ma

University of Michigan, Ann Arbor, MI, USA

University of Michigan, Ann Arbor, MI, USA
View Profile

,
Gefei Zuo

University of Michigan, Ann Arbor, MI, USA

University of Michigan, Ann Arbor, MI, USA
View Profile

,
Kevin Loughlin

University of Michigan, Ann Arbor, MI, USA

University of Michigan, Ann Arbor, MI, USA
View Profile

,
Xiaohe Cheng

Hong Kong University of Science and Technology, Hong Kong, China

Hong Kong University of Science and Technology, Hong Kong, China
View Profile

,
Yanqiang Liu

Shanghai Jiao Tong University, Shanghai, China

Shanghai Jiao Tong University, Shanghai, China
View Profile

,
Abel Mulugeta Eneyew

Addis Ababa Institute of Technology, Addis Ababa, Ethiopia

Addis Ababa Institute of Technology, Addis Ababa, Ethiopia
View Profile

,
Zhengwei Qi

Shanghai Jiao Tong University, Shanghai, China

Shanghai Jiao Tong University, Shanghai, China
View Profile

,
Baris Kasikci

University of Michigan, Ann Arbor, MI, USA

University of Michigan, Ann Arbor, MI, USA
View Profile

ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating SystemsMarch 2020Pages 827–844https://doi.org/10.1145/3373376.3378482

Published:13 March 2020Publication History

ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems

Pages 827–844

ABSTRACT

Cloud providers widely deploy FPGAs as application-specific accelerators for customer use. These providers seek to multiplex their FPGAs among customers via virtualization, thereby reducing running costs. Unfortunately, most virtualization support is confined to FPGAs that expose a restrictive, host-centric programming model in which accelerators cannot issue direct memory accesses (DMAs). The host-centric model incurs high runtime overhead for workloads that exhibit pointer chasing. Thus, FPGAs are beginning to support a shared-memory programming model in which accelerators can issue DMAs. However, virtualization support for shared-memory FPGAs is limited. This paper presents Optimus, the first hypervisor that supports scalable shared-memory FPGA virtualization. Optimus offers both spatial multiplexing and temporal multiplexing to provide efficient and flexible sharing of each accelerator on an FPGA. To share the FPGA-CPU interconnect at a high clock frequency, Optimus implements a multiplexer tree. To isolate each guest's address space, Optimus introduces the technique of page table slicing as a hardware-software co-design. To support preemptive temporal multiplexing, Optimus provides an accelerator preemption interface. We show that Optimus supports eight physical accelerators on a single FPGA and improves the aggregate throughput of twelve real-world benchmarks by 1.98x-7x.

References

[n.d.]. GP100 Pascal Whitepaper. https://images.nvidia.com/content/ pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf.Google Scholar
[n.d.]. Hugetlbfs Reservation. https://www.kernel.org/doc/html/v4. 18/vm/hugetlbfs_reserv.html.Google Scholar
[n.d.]. Open-Source FPGA Bitcoin Miner. https://github.com/progran ism/Open-Source-FPGA-Bitcoin-Miner.Google Scholar
[n.d.]. Transparent huge pages in 2.6.38. https://lwn.net/Articles /423584/.Google Scholar
[n.d.]. Transparent Hugepage Support. https://www.kernel.org/doc/D ocumentation/vm/transhuge.txt.Google Scholar
Amazon. [n.d.]. Amazon EC2 F1 Instances - Run Customizable FPGAs in the AWS Cloud. https://aws.amazon.com/ec2/instance-types/f1.Google Scholar
Amazon. [n.d.]. Official repository of the AWS EC2 FPGA Hardware and Software Development Kit. https://github.com/aws/aws-fpga.Google Scholar
Mikhail Asiatici, Nithin George, Kizheppatt Vipin, Suhaib A Fahmy, and Paolo Ienne. 2017. Virtualized execution runtime for fpga accelerators in the cloud. Ieee Access 5 (2017), 1900--1910.Google ScholarCross Ref
Systems Group at ETH Zurich. [n.d.]. Enzian is a research computer built by the Systems Group at ETH Zurich. http://www.enzian.syste ms/.Google Scholar
Osama G Attia, Tyler Johnson, Kevin Townsend, Philip Jones, and Joseph Zambreno. 2014. Cygraph: A reconfigurable architecture for parallel breadth-first search. In 2014 IEEE International Parallel & Distributed Processing Symposium Workshops. IEEE, 228--235.Google ScholarDigital Library
Rachata Ausavarungnirun, Joshua Landgraf, Vance Miller, Saugata Ghose, Jayneel Gandhi, Christopher J. Rossbach, and Onur Mutlu. 2018. Mosaic: Enabling Application-Transparent Support for Multiple Page Sizes in Throughput Processors. SIGOPS Oper. Syst. Rev. 52, 1 (Aug. 2018), 27--44. https://doi.org/10.1145/3273982.3273986Google ScholarDigital Library
Jayaram Bhasker. 1999. A Vhdl Primer. Prentice-Hall.Google Scholar
Alexander Brant and Guy GF Lemieux. 2012. ZUMA: An open FPGA overlay architecture. In 2012 IEEE 20th international symposium on field-programmable custom computing machines. IEEE, 93--96.Google ScholarDigital Library
Doug Burger. 2017. Microsoft unveils Project Brainwave for real-time AI. Microsoft Research, Microsoft 22 (2017).Google Scholar
Stuart Byma, J Gregory Steffan, Hadi Bannazadeh, Alberto Leon Garcia, and Paul Chow. 2014. Fpgas in the cloud: Booting virtualized hardware accelerators with openstack. In Field-Programmable Custom Computing Machines (FCCM), 2014 IEEE 22nd Annual International Symposium on. IEEE, 109--116.Google ScholarCross Ref
Adrian M Caulfield, Eric S Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, et al. 2016. A cloud-scale acceleration architecture. In The 49th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Press, 7.Google ScholarDigital Library
Ciro Ceissler, Ramon Nepomuceno, Marcio Pereira, and Guido Araujo. 2018. Automatic Offloading of Cluster Accelerators. In 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 224--224.Google Scholar
Fei Chen, Yi Shan, Yu Zhang, Yu Wang, Hubertus Franke, Xiaotao Chang, and Kun Wang. 2014. Enabling FPGAs in the cloud. In Proceedings of the 11th ACM Conference on Computing Frontiers. ACM, 3.Google ScholarDigital Library
Eric S Chung, James C Hoe, and Ken Mai. 2011. CoRAM: an in-fabric memory architecture for FPGA-based computing. In Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays. ACM, 97--106.Google ScholarDigital Library
L. Dagum and R. Menon. 1998. OpenMP: an industry standard API for shared-memory programming. IEEE Computational Science and Engineering 5, 1 (Jan 1998), 46--55. https://doi.org/10.1109/99.660313Google ScholarDigital Library
Suhaib A Fahmy, Kizheppatt Vipin, and Shanker Shreejith. 2015. Virtualized FPGA accelerators for efficient cloud computing. In Cloud Computing Technology and Science (CloudCom), 2015 IEEE 7th International Conference on. IEEE, 430--435.Google ScholarDigital Library
K. Fleming, H. Yang, M. Adler, and J. Emer. 2014. The LEAP FPGA operating system. In 2014 24th International Conference on Field Programmable Logic and Applications (FPL). 1--8. https://doi.org/10.1109/ FPL.2014.6927488Google ScholarCross Ref
Gokul Govindu, Ronald Scrofano, and Viktor K Prasanna. 2005. A library of parameterizable floating-point cores for FPGAs and their application to scientific computing. In Proc Int'l Conf. Eng. Reconfigurable Systems and Algorithms (ERSA'05). Citeseer.Google Scholar
Intel. [n.d.]. Acceleration Stack for Intel Xeon CPU with FPGAs Core Cache Interface (CCI-P) Reference Manual. https://www.altera.com/content/dam/altera-www/global/en_ US/pdfs/literature/manual/mnl-ias-ccip.pdf.Google Scholar
Intel. [n.d.]. Hardware Accelerator Research Program. https://softwar e.intel.com/en-us/hardware-accelerator-research-program.Google Scholar
Intel. [n.d.]. Intel Arria 10 Avalon-ST Interface with SR-IOV PCIe Solutions User Guide. https://www.altera.com/en_US/pdfs/literature /ug/ug_a10_pcie_sriov.pdf.Google Scholar
Intel. [n.d.]. Intel Programmable Acceleration Card with Intel Arria 10 GX FPGA. https://www.intel.com/content/www/us/en/programm able/products/boards_and_kits/dev-kits/altera/acceleration-cardarria- 10-gx.html.Google Scholar
Intel. [n.d.]. Intel Virtualization Technology for Directed I/O. https://software.intel.com/sites/default/files/managed/c5/15/vtdirected- io-spec.pdf.Google Scholar
Intel. [n.d.]. Open Programmable Acceleration Engine. https://opae.g ithub.io/latest/index.html.Google Scholar
Intel. 2017. Intel Open Source HD Graphics and Intel Iris Plus Graphics Programmer's Reference Manual for the 2016 - 2017 Intel Core Processors, Celeron Processors, and Pentium Processors based on the "Kaby Lake" Platform. https://01.org/sites/default/files/documentation/intelgfx- prm-osrc-kbl-vol05-memory_views.pdf.Google Scholar
Intel. 2019. Embedded Peripherals IP User Guide. https: //www.intel.com/content/dam/www/programmable/us/en/pdf s/literature/ug/ug_embedded_ip.pdf.Google Scholar
Intel. 2019. Intel Arria 10 FPGAs. https://www.intel.com/content/ww w/us/en/products/programmable/fpga/arria-10.html.Google Scholar
Intel. 2019. Intel FPGA Basic Building Blocks (BBB). https://github.c om/OPAE/intel-fpga-bbb.Google Scholar
Abhishek Kumar Jain, Suhaib A Fahmy, and Douglas L Maskell. 2015. Efficient Overlay architecture based on DSP blocks. In 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines. IEEE, 25--28.Google ScholarDigital Library
Abhishek Kumar Jain, Douglas L Maskell, and Suhaib A Fahmy. 2016. Throughput oriented FPGA overlays using DSP blocks. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1628--1633.Google Scholar
Neo Jia and Kirti Wankhede. [n.d.]. VFIO Mediated devices. https: //www.kernel.org/doc/Documentation/vfio-mediated-device.txt.Google Scholar
Ahmed Khawaja, Joshua Landgraf, Rohith Prakash, Michael Wei, Eric Schkufza, and Christopher J Rossbach. 2018. Sharing, Protection, and Compatibility for Reconfigurable Fabric with AmorphOS. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). USENIX Association, 107--127.Google ScholarDigital Library
Moein Khazraee, Lu Zhang, Luis Vega, and Michael Bedford Taylor. 2017. Moonwalk: NRE Optimization in ASIC Clouds. SIGPLAN Not. 52, 4 (April 2017), 511--526. https://doi.org/10.1145/3093336.3037749Google Scholar
Avi Kivity, Yaniv Kamay, Dor Laor, Uri Lublin, and Anthony Liguori. 2007. KVM: the Linux Virtual Machine Monitor. In In Proceedings of the 2007 Ottawa Linux Symposium (OLS).Google Scholar
Oliver Knodel, Paul R Genssler, and Rainer G Spallek. 2017. Virtualizing Reconfigurable Hardware to Provide Scalability in Cloud Architectures. Reconfigurable Architectures, Tools and Applications, RECATA (2017).Google Scholar
Oliver Knodel and Rainer G Spallek. 2015. RC3E: provision and management of reconfigurable hardware accelerators in a cloud environment. arXiv preprint arXiv:1508.06843 (2015).Google Scholar
Dirk Koch, Christian Beckhoff, and Guy GF Lemieux. 2013. An efficient FPGA overlay for portable custom instruction set extensions. In 2013 23rd international conference on field programmable logic and applications. IEEE, 1--8.Google ScholarCross Ref
Patrick Kutch. 2011. Pci-sig sr-iov primer: An introduction to sr-iov technology. Intel application note (2011), 321211--002.Google Scholar
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J Rossbach, and Emmett Witchel. 2016. Coordinated and efficient huge page management with ingens. In 12th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 16). 705--721.Google Scholar
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J Rossbach, and Emmett Witchel. 2017. Ingens: Huge Page Support for the OS and Hypervisor. ACM SIGOPS Operating Systems Review 51, 1 (2017), 83--93.Google ScholarDigital Library
Doug Lea. [n.d.]. A Memory Allocator. http://gee.cs.oswego.edu/dl/h tml/malloc.html.Google Scholar
W. Li, G. Jin, X. Cui, and S. See. 2015. An Evaluation of Unified Memory Technology on NVIDIA GPUs. In 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. 1092--1098. https: //doi.org/10.1109/CCGrid.2015.105Google Scholar
Enno Lübbers and Marco Platzner. 2009. ReconOS: Multithreaded programming for reconfigurable computers. ACM Transactions on Embedded Computing Systems (TECS) 9, 1 (2009), 8.Google ScholarDigital Library
Theodore Michailidis, Alex Delis, and Mema Roussopoulos. 2019. MEGA: overcoming traditional problems with OS huge page management. In Proceedings of the 12th ACM International Conference on Systems and Storage. ACM, 121--131.Google ScholarDigital Library
Microsoft. 2019. What are FPGAs and Project Brainwave? https://docs.microsoft.com/en-us/azure/machinelearning/ service/concept-accelerate-with-fpgas.Google Scholar
David Mulnix. 2017. Intel Xeon processor scalable family technical overview.Google Scholar
Muhsen Owaida, David Sidler, Kaan Kara, and Gustavo Alonso. 2017. Centaur: A framework for hybrid CPU-FPGA databases. In Field- Programmable Custom Computing Machines (FCCM), 2017 IEEE 25th Annual International Symposium on. IEEE, 211--218.Google ScholarCross Ref
Michele Paolino, Sébastien Pinneterre, and Daniel Raho. 2017. FPGA virtualization with accelerators overcommitment for Network Function Virtualization. In 2017 International Conference on ReConFigurable Computing and FPGAs (ReConFig). IEEE, 1--6.Google ScholarCross Ref
Wesley Peck, Erik Anderson, Jason Agron, Jim Stevens, Fabrice Baijot, and David Andrews. 2006. Hthreads: A computational model for reconfigurable devices. In 2006 International Conference on Field Programmable Logic and Applications. IEEE, 1--4.Google ScholarCross Ref
Sébastien Pinneterre, Spyros Chiotakis, Michele Paolino, and Daniel Raho. 2018. vFPGAmanager: A virtualization framework for orchestrated FPGA accelerator sharing in 5G cloud environments. In 2018 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB). IEEE, 1--5.Google ScholarCross Ref
Andrew Putnam, Adrian M Caulfield, Eric S Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, et al. 2014. A reconfigurable fabric for accelerating large-scale datacenter services. ACM SIGARCH Computer Architecture News 42, 3 (2014), 13--24.Google ScholarDigital Library
W. Qiao, J. Du, Z. Fang, M. Lo, M. F. Chang, and J. Cong. 2018. High- Throughput Lossless Compression on Tightly Coupled CPU-FPGA Platforms. In 2018 IEEE 26th Annual International Symposium on Field- Programmable Custom Computing Machines (FCCM). 37--44. https: //doi.org/10.1109/FCCM.2018.00015Google Scholar
Nikolay Sakharnykh. 2018. EVERYTHING YOU NEED TO KNOW ABOUT UNIFIED MEMORY. http://on-demand.gputechconf.com/g tc/2018/presentation/s8430-everything-you-need-to-know-aboutunified- memory.pdf.Google Scholar
Eric Schkufza, Michael Wei, and Christopher J Rossbach. 2019. Just- In-Time Compilation for Verilog: A New Technique for Improving the FPGA Programming Experience. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 271--286.Google ScholarDigital Library
Hardik Sharma, Jongse Park, Emmanuel Amaro, Bradley Thwaites, Praneetha Kotha, Anmol Gupta, Joon Kyung Kim, Asit Mishra, and Hadi Esmaeilzadeh. 2016. Dnnweaver: From high-level deep network models to fpga acceleration. In theWorkshop on Cognitive Architectures.Google Scholar
Hardik Sharma, Jongse Park, Divya Mahajan, Emmanuel Amaro, Joon Kyung Kim, Chenkai Shao, Asit Mishra, and Hadi Esmaeilzadeh. 2016. From high-level deep neural models to FPGAs. In Microarchitecture (MICRO), 2016 49th Annual IEEE/ACM International Symposium on. IEEE, 1--12.Google ScholarDigital Library
David Sidler, Zsolt István, Muhsen Owaida, and Gustavo Alonso. 2017. Accelerating pattern matching queries in hybrid CPU-FPGA architectures. In Proceedings of the 2017 ACM International Conference on Management of Data. ACM, 403--415.Google ScholarDigital Library
Any Silicon. [n.d.]. FPGA vs ASIC, What to Choose? https://anysilic on.com/fpga-vs-asic-choose/.Google Scholar
Hayden Kwok-Hay So and Robert Brodersen. 2008. A unified hardware/ software runtime environment for FPGA-based reconfigurable computers using BORPH. ACM Transactions on Embedded Computing Systems (TECS) 7, 2 (2008), 14.Google Scholar
Hayden Kwok-Hay So and Robert W Brodersen. 2007. Borph: An operating system for fpga-based reconfigurable computers. University of California, Berkeley.Google Scholar
J. Stuecheli, B. Blaner, C. R. Johns, and M. S. Siegel. 2015. CAPI: A Coherent Accelerator Processor Interface. IBM Journal of Research and Development 59, 1 (Jan 2015), 7:1--7:7. https://doi.org/10.1147/JR D.2014.2380198Google ScholarDigital Library
Naif Tarafdar, Thomas Lin, Eric Fukuda, Hadi Bannazadeh, Alberto Leon-Garcia, and Paul Chow. 2017. Enabling flexible network FPGA clusters in a heterogeneous cloud data center. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 237--246.Google ScholarDigital Library
Terasic and Altera. [n.d.]. DE5a-Net FPGA Development Kit User Manual. https://www.intel.com/content/dam/alterawww/ global/en_US/portal/dsn/42/doc-us-dsnbk-42--1804382103- de5a-net-user-manual.pdf.Google Scholar
Donald Thomas and Philip Moorby. 2008. The Verilog® Hardware Description Language. Springer Science & Business Media.Google Scholar
Kun Tian, Yaozu Dong, and David Cowperthwaite. 2014. A Full GPU Virtualization Solution with Mediated Pass-Through.. In USENIX Annual Technical Conference. 121--132.Google Scholar
Anuj Vaishnav, Khoa Dang Pham, and Dirk Koch. 2018. A survey on fpga virtualization. In 2018 28th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 131--1317.Google ScholarCross Ref
Duy Viet Vu, Oliver Sander, Timo Sandmann, Steffen Baehr, Jan Heidelberger, and Juergen Becker. 2014. Enabling partial reconfiguration for coprocessors in mixed criticality multicore systems using PCI Express Single-Root I/O Virtualization. In 2014 International Conference on ReConFigurable Computing and FPGAs (ReConFig14). IEEE, 1--6.Google ScholarCross Ref
WeiWang, Miodrag Bolic, and Jonathan Parri. 2013. pvFPGA: accessing an FPGA-based hardware accelerator in a paravirtualized environment. In Hardware/Software Codesign and System Synthesis (CODES+ ISSS), 2013 International Conference on. IEEE, 1--9.Google Scholar
JagathWeerasinghe, Francois Abel, Christoph Hagleitner, and Andreas Herkersdorf. 2015. Enabling FPGAs in hyperscale data centers. In Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015 IEEE 12th Intl Conf on. IEEE, 1078--1086.Google Scholar
Gabriel Weisz and James C Hoe. 2015. CoRAM++: Supporting datastructure- specific memory interfaces for FPGA computing. In 2015 25th International Conference on Field Programmable Logic and Applications (FPL). IEEE, 1--8.Google ScholarCross Ref
Gabriel Weisz, Joseph Melber, Yu Wang, Kermin Fleming, Eriko Nurvitadhi, and James C Hoe. 2016. A study of pointer-chasing performance on shared-memory processor-FPGA systems. In Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 264--273.Google ScholarDigital Library
Gabriel Weisz, Joseph Melber, Yu Wang, Kermin Fleming, Eriko Nurvitadhi, and James C. Hoe. 2016. A Study of Pointer-Chasing Performance on Shared-Memory Processor-FPGA Systems. In Proceedings of the 2016 ACM/SIGDA International Symposium on Field- Programmable Gate Arrays (FPGA '16). ACM, New York, NY, USA, 264--273. https://doi.org/10.1145/2847263.2847269Google Scholar
Lei Xia, Sanjay Kumar, Xue Yang, Praveen Gopalakrishnan, York Liu, Sebastian Schoenberg, and Xingang Guo. 2011. Virtual WiFi: bring virtualization from wired to wireless. In Acm sigplan notices, Vol. 46. ACM, 181--192.Google Scholar
Xilinx. [n.d.]. AXI Interconnect. https://www.xilinx.com/products/in tellectual-property/axi_interconnect.html.Google Scholar
Xilinx. [n.d.]. Designing with SR-IOV Capability of Xilinx Virtex-7 PCI Express Gen3 Integrated Block. https://www.xilinx.com/support /documentation/application_notes/xapp1177-pcie-gen3-sriov.pdf.Google Scholar
Xilinx. [n.d.]. DMA for PCI Express (PCIe) Subsystem. https://www.xi linx.com/products/intellectual-property/pcie-dma.html.Google Scholar
Xilinx. [n.d.]. SDAccel Development Environment. https://www.xili nx.com/products/design-tools/software-zone/sdaccel.html.Google Scholar
Peter Xu. 2018. Device Assignment with Nested Guest and DPDK. https://www.linux-kvm.org/images/a/a6/KVM_Forum_2018_ viommu_vfio.pdf.Google Scholar
Hangchen Yu, Arthur M. Peters, Amogh Akshintala, and Christopher J. Rossbach. 2019. Automatic Virtualization of Accelerators. In 17th Workshop on Hot Topics in Operating Systems (HotOS {XVII}).Google Scholar
H. Zeng, C. Zhang, and V. Prasanna. 2017. Fast Generation of High Throughput Customized Deep Learning Accelerators on FPGAs. In 2017 International Conference on ReConFigurable Computing and FPGAs (ReConFig). 1--8. https://doi.org/10.1109/RECONFIG.2017.8279792Google Scholar
Jiansong Zhang, Yongqiang Xiong, Ningyi Xu, Ran Shu, Bojie Li, Peng Cheng, Guo Chen, and Thomas Moscibroda. 2017. The Feniks FPGA Operating System for Cloud Computing. In Proceedings of the 8th Asia-Pacific Workshop on Systems. ACM, 22.Google ScholarDigital Library
Ritchie Zhao, Weinan Song, Wentao Zhang, Tianwei Xing, Jeng-Hau Lin, Mani Srivastava, Rajesh Gupta, and Zhiru Zhang. 2017. Accelerating binarized convolutional neural networks with softwareprogrammable fpgas. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. ACM, 15--24.Google ScholarDigital Library
Tianhao Zheng, David Nellans, Arslan Zulfiqar, Mark Stephenson, and Stephen W Keckler. 2016. Towards high performance paged memory for GPUs. In 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, 345--357.Google ScholarCross Ref
S. Zhou and V. K. Prasanna. 2017. Accelerating Graph Analytics on CPU-FPGA Heterogeneous Platform. In 2017 29th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD). 137--144. https://doi.org/10.1109/SBAC-PAD.2017.25Google Scholar

Index Terms

A Hypervisor for Shared-Memory FPGA Platforms
1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
2. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Software infrastructure
        Virtual machines

Recommendations

Microkernel hypervisor for a hybrid ARM-FPGA platform
ASAP '13: Proceedings of the 2013 IEEE 24th International Conference on Application-specific Systems, Architectures and Processors (ASAP)

Reconfigurable architectures have found use in a wide range of application domains, but mostly as static accelerators for computationally intensive functions. Commodity computing adoption has not taken off due primarily to design complexity challenges. ...
Read More
A comprehensive performance analysis of virtual routers on FPGA
Special Section on 19th Reconfigurable Architectures Workshop (RAW 2012)

Network virtualization has gained much popularity with the advent of datacenter networking. The hardware aspect of network virtualization, router virtualization, allows network service providers to consolidate network hardware, reducing equipment cost ...
Read More
A Hypervisor Approach to Enable Live Migration with Passthrough SR-IOV Network Devices
Special Topics

Single-Root I/O Virtualization (SR-IOV) is a specification that allows a single PCI Express (PCIe) device (physical function or PF) to be used as multiple PCIe devices (virtual functions or VF). In a virtualization system, each VF can be directly ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems
March 2020
1412 pages
ISBN:9781450371025
DOI:10.1145/3373376
General Chair:
James Larus
EPFL
,
Program Chairs:
Luis Ceze
University of Washington
,
Karin Strauss
Microsoft
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 13 March 2020
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Badges
- Artifacts Available
- Artifacts Evaluated & Reusable
Author Tags
fpga
optimus
virtualization
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate535of2,713submissions,20%
Upcoming Conference
ASPLOS '24

Sponsor:

sigarch

sigarch

sigarch

29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems

April 27 - May 1, 2024

La Jolla , CA , USA
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 28
  Total Citations
  View Citations
- 1,305
  Total Downloads
- Downloads (Last 12 months)239
- Downloads (Last 6 weeks)34
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

A Hypervisor for Shared-Memory FPGA Platforms

ASPLOS '20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems

ABSTRACT

References

Cited By

Index Terms

Recommendations

Microkernel hypervisor for a hybrid ARM-FPGA platform

A comprehensive performance analysis of virtual routers on FPGA

A Hypervisor Approach to Enable Live Migration with Passthrough SR-IOV Network Devices