DOI: 10.1145/3526064.3534112
research-article

Predicting Slow Network Transfers in Scientific Computing

Published: 27 June 2022

ABSTRACT

Data access throughput is a key performance metric in scientific computing, particularly for distributed data-intensive applications. While a body of prior work focuses on elephant connections that consume a significant fraction of network bandwidth, this study focuses on predicting slow connections that create bottlenecks in distributed workflows. We analyze network traffic logs collected between January 2019 and May 2021 at the National Energy Research Scientific Computing Center (NERSC). Based on patterns observed in this data collection, we define a set of features for identifying low-performing data transfers. Through extensive feature engineering and feature selection, we identify a number of new features that significantly enhance prediction performance. With these new features, even a relatively simple decision tree model can predict slow connections with an F1 score as high as 0.945.
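
For readers unfamiliar with the metric, the F1 score quoted above is the harmonic mean of precision and recall for the class of interest (here, slow connections). The following is a minimal sketch in plain Python of how such a score is computed. The label vectors and the function name are made up for illustration and are not taken from the paper's data or code.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for the `positive` class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only: 1 = slow transfer, 0 = normal transfer.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it is a common choice when the class of interest is relatively rare, since plain accuracy can look high even if most positive cases are missed.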


              Published in

              SNTA '22: Fifth International Workshop on Systems and Network Telemetry and Analytics
              June 2022
              62 pages
              ISBN: 9781450393157
              DOI: 10.1145/3526064

              Copyright © 2022 ACM

              Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Acceptance Rates

              Overall Acceptance Rate: 22 of 106 submissions, 21%
