Published February 21, 2022 | Version v1
Report Open

HPC for Urgent Decision-Making

  • 1. FORTH
  • 2. CEA
  • 3. KTH
  • 4. INAF
  • 5. Scapos AG

Description

Emerging use cases from incident response planning and broad-scope European initiatives (e.g. Destination Earth [1,2], European Green Deal and Digital Package [21]) are expected to require federated, distributed infrastructures combining computing and data platforms. These will provide the elasticity that enables users to build applications and integrate data for thematic specialisation and decision support, within ever-shortening response time windows.

For prompt and, in particular, for urgent decision support, the conventional usage modes of HPC centres are not adequate: these rely on relatively long-term arrangements for time-scheduled exclusive use of HPC resources, and enforce well-established yet time-consuming policies for granting access. In urgent decision support scenarios, managers or members of incident response teams must initiate processing and control the resources required based on their real-time judgement of how a complex situation evolves over time. This circle of clients is distinct from the regular users of HPC centres, and they must interact with HPC workflows on demand and in real time, while engaging significant HPC and data processing resources in or across HPC centres.

This white paper considers the technical implications of supporting urgent decisions through establishing flexible usage modes for computing, analytics and AI/ML-based applications using HPC and large, dynamic assets.

The target decision support use cases will involve ensembles of jobs, data staging to support workflows, and interactions with services/facilities external to HPC systems/centres. Our analysis identifies the need for flexible and interactive access to HPC resources, particularly in the context of dynamic workflows processing large datasets. This poses several technical and organisational challenges: short-notice secure access to HPC and data resources, dynamic resource allocation and scheduling, coordination of resource managers, support for data-intensive workflows (including data staging on node-local storage), preemption of already running workloads and interactive steering of simulations. Federation of services and resources across multiple sites will help to increase availability, provide elasticity for time-varying resource needs and make it possible to exploit data locality.
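One of the challenges listed above, preemption of already running workloads, can be illustrated with a toy model. The following is a minimal sketch and is not taken from the white paper: the `Job` and `Cluster` classes, the job names and the policy of suspending the most recently started regular jobs first are all assumptions made for illustration. Production resource managers implement far richer policies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str             # hypothetical job label
    nodes: int            # number of nodes requested
    urgent: bool = False  # urgent jobs may preempt regular ones


@dataclass
class Cluster:
    total_nodes: int
    running: List[Job] = field(default_factory=list)
    suspended: List[Job] = field(default_factory=list)

    def free_nodes(self) -> int:
        return self.total_nodes - sum(j.nodes for j in self.running)

    def submit(self, job: Job) -> bool:
        """Start the job if it fits; an urgent job may preempt regular jobs."""
        if job.nodes > self.total_nodes:
            return False
        if job.urgent:
            # Assumed policy: most recently started regular jobs are suspended first.
            victims = [j for j in reversed(self.running) if not j.urgent]
            reclaimable = self.free_nodes() + sum(j.nodes for j in victims)
            if reclaimable < job.nodes:
                return False  # preemption would not help, so suspend nothing
            for victim in victims:
                if self.free_nodes() >= job.nodes:
                    break
                self.running.remove(victim)
                self.suspended.append(victim)
        if self.free_nodes() < job.nodes:
            return False
        self.running.append(job)
        return True


cluster = Cluster(total_nodes=8)
cluster.submit(Job("regular-a", 4))
cluster.submit(Job("regular-b", 3))
started = cluster.submit(Job("urgent-ensemble", 5, urgent=True))
print(started, [j.name for j in cluster.running], [j.name for j in cluster.suspended])
```

In this sketch the urgent job needs five nodes while only one is free, so both regular jobs are suspended before it starts. A real deployment would also have to checkpoint and later resume the suspended work, which the toy model leaves out.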

Notes

The authors would like to thank Maria S. Perez (Professor at Universidad Politécnica de Madrid, Spain) for her insightful critique of earlier drafts of this white paper.

Files

ETP4HPC_WP_urgent-decision-making_20220217.pdf

Size: 2.4 MB
md5:fbcd143406007195ea84bc3b07f4b13c

Additional details

Funding

PRACE-6IP – PRACE 6th Implementation Phase Project, grant 823767, European Commission
ESCAPE – European Science Cluster of Astronomy & Particle physics ESFRI research infrastructures, grant 824064, European Commission
DEEP-SEA – DEEP – Software for Exascale Architectures, grant 955606, European Commission

References

  • [1] Destination Earth (DestinE) initiative. https://ec.europa.eu/digital-single-market/en/destination-earth-destine
  • [2] Destination Earth: Use Cases Analysis, JRC Technical Report JRC122456, 2020. https://publications.jrc.ec.europa.eu/repository/handle/JRC122456
  • [3] Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016 Mar 15;3:160018. doi: 10.1038/sdata.2016.18. Erratum in: Sci Data. 2019 Mar 19;6(1):6. PMID: 26978244; PMCID: PMC4792175.
  • [4] N. Brown, R. Nash, G. Gibb, B. Prodan, M. Kontak, V. Olshevsky, and W. Der Chien, "The role of interactive supercomputing in using HPC for urgent decision making", in Proceedings of the International Conference on High Performance Computing. Springer, 2019, pp. 528–540.
  • [5] G. Gibb, R. Nash, N. Brown and B. Prodan, "The Technologies Required for Fusing HPC and Real-Time Data to Support Urgent Computing", in Proceedings of the 2019 IEEE/ACM Workshop on HPC for Urgent Decision Making (UrgentHPC), 2019, pp. 24-34.
  • [6] Earth System Modeling Framework. https://earthsystemmodeling.org/
  • [7] T. C. Schulthess, P. Bauer, N. Wedi, O. Fuhrer, T. Hoefler and C. Schär, "Reflecting on the Goal and Baseline for Exascale Computing: A Roadmap Based on Weather and Climate Simulations," in Computing in Science & Engineering, vol. 21, no. 1, pp. 30-41, 1 Jan.-Feb. 2019, doi: 10.1109/MCSE.2018.2888788.
  • [8] Baker, D.N., Erickson, P.J., Fennell, J.F. et al. Space Weather Effects in the Earth's Radiation Belts. Space Sci Rev 214, 17 (2018). https://doi.org/10.1007/s11214-017-0452-10
  • [9] R. Kube et al., "Near real-time analysis of big fusion data on HPC systems," 2020 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2020, pp. 55-63, doi: 10.1109/UrgentHPC51945.2020.00012.
  • [10] A. Kremin, S. Bailey, J. Guy, T. Kisner and K. Zhang, "Rapid Processing of Astronomical Data for the Dark Energy Spectroscopic Instrument," 2020 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2020, pp. 1-9, doi: 10.1109/UrgentHPC51945.2020.00006.
  • [11] Jiang, M., Bu, C., Zeng, J. et al. Applications and challenges of high performance computing in genomics. CCF Trans. HPC (2021). https://doi.org/10.1007/s42514-021-00081-w
  • [12] CISCO 2020, Global Network Trends Report, Tech. rep., CISCO. URL https://www.cisco.com/c/dam/m/en_us/solutions/enterprise-networks/networking-report/files/GLBL-ENG_NB-06_0_NA_RPT_PDF_MOFU-no-NetworkingTrendsReport-NB_rpten018612_5.pdf
  • [13] Asch M, Moore T, Badia R, et al. Big data and extreme-scale computing: Pathways to Convergence-Toward a shaping strategy for a future software and data ecosystem for scientific inquiry. The International Journal of High Performance Computing Applications. 2018;32(4):435-479. doi:10.1177/1094342018778123
  • [14] E. Yamasaki, 2012, What We Can Learn From Japan's Early Earthquake Warning System, Momentum: Volume 1: Issue 1, Article 2.
  • [15] F. Løvholt, S. Lorito, J. Macias, M. Volpe, J. Selva and S. Gibbons, "Urgent Tsunami Computing," 2019 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), 2019, pp. 45-50, doi: 10.1109/UrgentHPC49580.2019.00011.
  • [16] Siew Hoon Leong, Dieter Kranzlmüller, "Towards a General Definition of Urgent Computing," Procedia Computer Science, Volume 51, 2015, https://doi.org/10.1016/j.procs.2015.05.402.
  • [17] Tzachor, A., Whittlestone, J., Sundaram, L. et al. Artificial intelligence in a crisis needs ethics with urgency. Nat Mach Intell 2, 365–366 (2020). https://doi.org/10.1038/s42256-020-0195-0
  • [18] Chen, N., Liu, W., Bai, R. et al. Application of computational intelligence technologies in emergency management: a literature review. Artif Intell Rev 52, 2131–2168 (2019). https://doi.org/10.1007/s10462-017-9589-8
  • [19] D. Elia, S. Fiore and G. Aloisio, "Towards HPC and Big Data Analytics Convergence: Design and Experimental Evaluation of a HPDA Framework for eScience at Scale," in IEEE Access, vol. 9, pp. 73307-73326, 2021. https://doi.org/10.1109/ACCESS.2021.3079139
  • [20] European High Performance Computing Joint Undertaking (EuroHPC JU). https://eurohpc-ju.europa.eu
  • [21] A European Green Deal. https://ec.europa.eu/info/strategy/priorities-2019-2024/european-green-deal_en
  • [22] R. Roscher, B. Bohn, M. F. Duarte and J. Garcke, "Explainable Machine Learning for Scientific Insights and Discoveries," in IEEE Access, vol. 8, pp. 42200-42216, 2020, doi: 10.1109/ACCESS.2020.2976199.
  • [23] Strategic Research and Innovation Agenda of the European Open Science Cloud (EOSC), Feb. 2021. https://www.eosc.eu/sites/default/files/EOSC-SRIA-V1.0_15Feb2021.pdf