Abstract
The computer architecture landscape is being reshaped by the new opportunities, challenges, and constraints brought by the cloud. On the one hand, high-level applications profit from specialised hardware to boost their performance and reduce deployment costs. On the other hand, cloud providers maximise the CPU time allocated to client applications by offloading infrastructure tasks to hardware accelerators. While it is well understood how to do this for, e.g., network function virtualisation and protocols such as TCP/IP, support for higher networking layers is still largely missing, limiting the potential of accelerators. In this article, we present Strega, an open-source, lightweight Hypertext Transfer Protocol (HTTP) server that enables crucial functionality such as FPGA-accelerated functions being called through a RESTful protocol (FPGA-as-a-Function). Our experimental analysis shows that a single Strega node sustains a throughput of 1.7 M HTTP requests per second with an end-to-end latency as low as 16 μs, outperforming nginx running on 32 vCPUs in both metrics, and can even be an alternative to the traditional OpenCL flow over the PCIe bus. Through this work, we pave the way for running microservices directly on FPGAs, bypassing CPU overhead and realising the full potential of FPGA acceleration in distributed cloud applications.
Index Terms
- Strega: An HTTP Server for FPGAs