DOI: 10.1145/2541940.2541961
research-article

Q100: the architecture and design of a database processing unit

Published: 24 February 2014

ABSTRACT

In this paper, we propose Database Processing Units, or DPUs, a class of domain-specific database processors that can efficiently handle database applications. As a proof of concept, we present the instruction set architecture, microarchitecture, and hardware implementation of one DPU, called Q100. The Q100 has a collection of heterogeneous ASIC tiles that process relational tables and columns quickly and energy-efficiently. The architecture uses coarse-grained instructions that manipulate streams of data, thereby maximizing pipeline and data parallelism, and minimizing the need to time multiplex the accelerator tiles and spill intermediate results to memory. This work explores a Q100 design space of 150 configurations, selecting three for further analysis: a small, power-conscious implementation, a high-performance implementation, and a balanced design that maximizes performance per Watt. We then demonstrate that the power-conscious Q100 handles the TPC-H queries with three orders of magnitude less energy than a state-of-the-art software DBMS, while the performance-oriented design outperforms the same DBMS by 70X.
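
The idea of coarse-grained operators that pass streams of column data to one another, rather than writing intermediate results to memory, can be illustrated with a minimal software sketch. This is only an analogy using Python generators; it is not the Q100 instruction set, and the operator names (`col_select`, `gather`, `sum_products`, `query`) and the sample columns are hypothetical, chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple


def col_select(column: Sequence[int],
               predicate: Callable[[int], bool]) -> Iterator[Tuple[int, int]]:
    """Stream (row_id, value) pairs for the values that satisfy the predicate."""
    for row_id, value in enumerate(column):
        if predicate(value):
            yield row_id, value


def gather(selected: Iterable[Tuple[int, int]],
           other: Sequence[int]) -> Iterator[Tuple[int, int]]:
    """Pair each selected value with the same row of a second column."""
    for row_id, value in selected:
        yield value, other[row_id]


def sum_products(pairs: Iterable[Tuple[int, int]]) -> int:
    """Consume the stream and reduce it to a single aggregate."""
    total = 0
    for left, right in pairs:
        total += left * right
    return total


def query(price: Sequence[int], discount: Sequence[int], min_price: int) -> int:
    """Chain the operators; each row flows through all stages one at a time,
    so no intermediate column is ever materialized as a list."""
    selected = col_select(price, lambda p: p >= min_price)
    return sum_products(gather(selected, discount))


# Hypothetical sample columns (integer prices, integer discount percentages).
price = [100, 250, 400, 50]
discount = [1, 2, 3, 4]
result = query(price, discount, 100)
```

Here the three stages form a pipeline in the loose sense used above: the consumer pulls one row at a time through the selection and gather stages, so the filtered column never exists as a whole in memory. A hardware design would run such stages concurrently on separate tiles, which this sequential sketch does not model.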

      • Published in

        ASPLOS '14: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
        February 2014
        780 pages
        ISBN: 9781450323055
        DOI: 10.1145/2541940

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        ASPLOS '14 paper acceptance rate: 49 of 217 submissions, 23%. Overall acceptance rate: 535 of 2,713 submissions, 20%.
