Evaluating application performance and energy consumption on hybrid CPU+GPU architecture

Cluster Computing

Abstract

The High Performance Computing (HPC) community has aimed for many years to increase performance regardless of energy consumption. By the end of the decade, the next generation of HPC systems is expected to reach sustained performance on the order of exaflops. This requires many times more performance than the fastest supercomputers of today deliver. Achieving this goal is unthinkable with current technology due to strict constraints on supplied power. Therefore, finding ways to improve energy efficiency has become a main challenge in state-of-the-art research. The present paper investigates energy efficiency on heterogeneous CPU+GPU architectures using a scientific application from the agroforestry domain as a case study. Unlike other works, our work evaluates how the workload of the application may affect energy efficiency on hybrid architectures. Results indicate that the power supply constraints also depend on the workload.

Acknowledgements

This work was partially supported by several Brazilian research agencies: CNPq, CAPES, FAPERGS and FINEP. We would like to thank these agencies; their support made this work possible. We would also like to thank all members of the Parallel and Distributed Processing Group (GPPD) at the Federal University of Rio Grande do Sul (UFRGS); their help and expertise were of great value. This research has been partially supported by CAPES-BRAZIL under grants 5854/11-3 and 5847/11-7. This work was developed in the context of the associated international laboratory between UFRGS and Université de Grenoble, LICIA.

Author information

Corresponding author

Correspondence to Edson Luiz Padoin.

About this article

Cite this article

Padoin, E.L., Pilla, L.L., Boito, F.Z. et al. Evaluating application performance and energy consumption on hybrid CPU+GPU architecture. Cluster Comput 16, 511–525 (2013). https://doi.org/10.1007/s10586-012-0219-6

  • DOI: https://doi.org/10.1007/s10586-012-0219-6
