Impact of Noise on Scaling of Collectives: An Empirical Evaluation

  • Conference paper
High Performance Computing - HiPC 2006 (HiPC 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4297)

Abstract

It is becoming increasingly evident that operating system interference, in the form of daemon activity and interrupts, contributes significantly to the performance degradation of parallel applications in large clusters. An earlier theoretical study evaluated the impact of system noise on application performance for different noise distributions [1]. Our work complements that theoretical analysis by presenting an empirical study of noise in production clusters. We designed a parallel benchmark and used it on large clusters at the San Diego Supercomputer Center to collect noise-related data. This data was fed to a simulator that predicts the performance of collective operations using the model of [1]. We report a comparison of the predicted and the observed performance. Additionally, the tools developed in the process have been instrumental in identifying anomalous nodes that could degrade application performance if left undetected.
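The idea of feeding measured noise into a simulator of a collective operation can be illustrated with a minimal sketch. It is not the authors' simulator and does not reproduce the model of [1]. It assumes a simplified setting in which every process performs a fixed amount of work, suffers one noise delay resampled from the measured data, and then enters a barrier that completes when the slowest process arrives. The function names, the `work` parameter, and the sample values are hypothetical.

```python
import random


def simulate_barrier_time(noise_samples, num_procs, work=1.0, rng=None):
    """Simulate one compute-then-barrier phase.

    Assumption: each process needs `work` time units plus one noise delay
    drawn uniformly from the measured samples; the barrier finishes when
    the slowest process arrives.
    """
    rng = rng or random.Random()
    return max(work + rng.choice(noise_samples) for _ in range(num_procs))


def mean_barrier_time(noise_samples, num_procs, work=1.0, trials=1000, seed=0):
    """Average the simulated barrier time over many independent phases."""
    rng = random.Random(seed)
    total = sum(
        simulate_barrier_time(noise_samples, num_procs, work, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    # Hypothetical noise data: mostly no delay, a few small delays,
    # and one rare long interruption.
    samples = [0.0] * 95 + [0.05] * 4 + [0.5]
    for n in (1, 16, 256, 4096):
        print(n, round(mean_barrier_time(samples, n), 3))
```

Under these assumptions the mean phase time grows with the number of processes, because a larger job is more likely to include at least one process hit by the rare long delay. This is the kind of scaling effect a comparison of predicted and observed performance would examine.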

References

  1. Agarwal, S., Garg, R., Vishnoi, N.K.: The impact of noise on the scaling of collectives: A theoretical approach. In: Bader, D.A., Parashar, M., Sridhar, V., Prasanna, V.K. (eds.) HiPC 2005. LNCS, vol. 3769, pp. 280–289. Springer, Heidelberg (2005)

  2. Jones, T., Brenner, L., Fier, J.: Impacts of Operating Systems on the Scalability of Parallel Applications, Lawrence Livermore National Laboratory, Tech. Rep. UCRL-MI-202629 (March 2003)

  3. Giosa, R., Petrini, F., Davis, K., Lebaillif-Delamare, F.: Analysis of System Overhead on Parallel Computers. In: IEEE International Symposium on Signal Processing and Information Technology (ISSPIT) (2004)

  4. Petrini, F., Kerbyson, D.J., Pakin, S.: The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8192 Processors of ASCI Q. In: ACM Supercomputing (2003)

  5. Tsafrir, D., Etsion, Y., Feitelson, D.G., Kirkpatrick, S.: System Noise, OS Clock Ticks, and Fine-grained Parallel Applications. In: ICS (2005)

  6. Moreira, J., Franke, H., Chan, W., Fong, L., Jette, M., Yoo, A.: A Gang-Scheduling System for ASCI Blue-Pacific. In: International Conference on High Performance Computing and Networking (1999)

  7. Hori, A., Tezuka, H., Ishikawa, Y.: Highly Efficient Gang Scheduling Implementations. In: ACM/IEEE Conference on Supercomputing (1998)

  8. Frachtenberg, E., Petrini, F., Fernandez, J., Pakin, S., Coll, S.: STORM: Lightning-Fast Resource Management. In: ACM/IEEE Conference on Supercomputing (2002)

  9. DataStar Compute Resource at SDSC, [Online] Available: http://www.sdsc.edu/user_services/datastar/

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Garg, R., De, P. (2006). Impact of Noise on Scaling of Collectives: An Empirical Evaluation. In: Robert, Y., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing - HiPC 2006. HiPC 2006. Lecture Notes in Computer Science, vol 4297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945918_45

  • DOI: https://doi.org/10.1007/11945918_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68039-0

  • Online ISBN: 978-3-540-68040-6

  • eBook Packages: Computer Science, Computer Science (R0)
