Evaluating application performance and energy consumption on hybrid CPU+GPU architecture

Padoin, Edson Luiz; Pilla, Laércio Lima; Boito, Francieli Zanon; Kassick, Rodrigo Virote; Velho, Pedro; Navaux, Philippe O. A.

doi:10.1007/s10586-012-0219-6

Published: 30 June 2012

Cluster Computing, volume 16, pages 511–525 (2013)

Edson Luiz Padoin¹,², Laércio Lima Pilla¹, Francieli Zanon Boito¹, Rodrigo Virote Kassick¹, Pedro Velho¹ & Philippe O. A. Navaux¹

Abstract

The High Performance Computing (HPC) community has aimed for many years to increase performance regardless of energy consumption. By the end of the decade, the next generation of HPC systems is expected to reach sustained performance on the order of exaflops. This requires many times the performance of today's fastest supercomputers. Achieving this goal is unthinkable with current technology due to strict constraints on supplied power. Therefore, finding ways to improve energy efficiency has become a main challenge in state-of-the-art research. The present paper investigates energy efficiency on heterogeneous CPU+GPU architectures using a scientific application from the agroforestry domain as a case study. Unlike other works, our work evaluates how the workload of the application may affect energy efficiency on hybrid architectures. The results indicate that power supply constraints also depend on the workload.

Acknowledgements

This work was partially supported by several Brazilian research agencies: CNPq, CAPES, FAPERGS and FINEP. We would like to thank these agencies; their support made this work possible. We would also like to thank all members of the Parallel and Distributed Processing Group (GPPD) at the Federal University of Rio Grande do Sul (UFRGS), whose help and expertise were of great value. This research has been partially supported by CAPES-BRAZIL under grants 5854/11-3 and 5847/11-7. This work was developed in the context of the associated international laboratory between UFRGS and Université de Grenoble (LICIA).

Author information

Authors and Affiliations

Institute of Informatics, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
Edson Luiz Padoin, Laércio Lima Pilla, Francieli Zanon Boito, Rodrigo Virote Kassick, Pedro Velho & Philippe O. A. Navaux
Department of Exact Sciences and Engineering, Regional University of Northwest of Rio Grande do Sul (UNIJUI), Ijuí, RS, Brazil
Edson Luiz Padoin

Corresponding author

Correspondence to Edson Luiz Padoin.

About this article

Cite this article

Padoin, E.L., Pilla, L.L., Boito, F.Z. et al. Evaluating application performance and energy consumption on hybrid CPU+GPU architecture. Cluster Comput 16, 511–525 (2013). https://doi.org/10.1007/s10586-012-0219-6

Received: 04 January 2012
Accepted: 14 June 2012
Published: 30 June 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s10586-012-0219-6
