Batch Solution of Small PDEs with the OPS DSL

Reguly, Istvan Z.; Moore, Branden; Schmielau, Tim; du Toit, Jacques; Mudalige, Gihan R.

doi:10.1007/978-3-030-34356-9_12


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11887)

Included in the following conference series:

International Conference on High Performance Computing


Abstract

In this paper we discuss the challenges and optimisation opportunities that arise when solving a large number of small, equally sized discretised PDEs on regular grids. We present an extension of the OPS (Oxford Parallel library for Structured meshes) embedded Domain Specific Language, show how support can be added for solving multiple systems, and show how OPS makes it easy to deploy a variety of transformations and optimisations. The new capabilities in OPS allow data structure transformations, as well as execution schedule transformations, to be applied automatically to deliver high performance on a variety of hardware platforms. We evaluate our work on an industrially representative finance simulation on Intel CPUs and NVIDIA GPUs.
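
The abstract does not spell out which data structure transformations are applied, so the following is only a minimal sketch of one plausible example, assuming a choice between storing each system contiguously and interleaving the systems so that the batch index varies fastest. All names here (`idx_batch_outer`, `idx_batch_inner`, `pack`, `laplacian`, `unpack`) are hypothetical and are not part of the OPS API. The point it illustrates is that the stencil computation is written once against an index function, so the storage layout can be swapped without changing the numerical result.

```python
# Hypothetical illustration only: none of these names belong to OPS.

def idx_batch_outer(b, j, i, nbatch, ny, nx):
    """Layout A: each system is one contiguous nx*ny block."""
    return (b * ny + j) * nx + i


def idx_batch_inner(b, j, i, nbatch, ny, nx):
    """Layout B: the same grid point of all systems is stored adjacently."""
    return (j * nx + i) * nbatch + b


def pack(systems, index, nbatch, ny, nx):
    """Copy nested lists systems[b][j][i] into one flat buffer."""
    flat = [0.0] * (nbatch * ny * nx)
    for b in range(nbatch):
        for j in range(ny):
            for i in range(nx):
                flat[index(b, j, i, nbatch, ny, nx)] = systems[b][j][i]
    return flat


def laplacian(flat, index, nbatch, ny, nx):
    """5-point stencil on the interior of every system; boundaries stay 0."""
    out = [0.0] * len(flat)
    for b in range(nbatch):
        for j in range(1, ny - 1):
            for i in range(1, nx - 1):
                def at(jj, ii):
                    return flat[index(b, jj, ii, nbatch, ny, nx)]
                out[index(b, j, i, nbatch, ny, nx)] = (
                    at(j - 1, i) + at(j + 1, i)
                    + at(j, i - 1) + at(j, i + 1)
                    - 4.0 * at(j, i)
                )
    return out


def unpack(flat, index, nbatch, ny, nx):
    """Inverse of pack: rebuild nested lists result[b][j][i]."""
    return [[[flat[index(b, j, i, nbatch, ny, nx)] for i in range(nx)]
             for j in range(ny)] for b in range(nbatch)]


# A small batch: 3 systems on a 4 x 5 grid with made-up initial data.
nbatch, ny, nx = 3, 4, 5
systems = [[[float(b + j * j + i * i * i) for i in range(nx)]
            for j in range(ny)] for b in range(nbatch)]

results = {}
for name, index in (("outer", idx_batch_outer), ("inner", idx_batch_inner)):
    flat = pack(systems, index, nbatch, ny, nx)
    out = laplacian(flat, index, nbatch, ny, nx)
    results[name] = unpack(out, index, nbatch, ny, nx)

# Both layouts produce identical results for every system in the batch.
print(results["outer"] == results["inner"])
```

In this sketch the batch-innermost layout places the same grid point of neighbouring systems next to each other in memory, which is the kind of property that can matter for vectorised or GPU execution. Which layout performs better on a given platform is an empirical question that the sketch does not address.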

Acknowledgements

István Reguly was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. Project no. PD 124905 has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the PD_17 funding scheme. Supported by the ÚNKP-18-4-PPKE-18 new National Excellence Program of the Ministry of Human Capacities.

Author information

Authors and Affiliations

Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
Istvan Z. Reguly
University of Warwick, Department of Computer Science, Coventry, UK
Istvan Z. Reguly & Gihan R. Mudalige
Numerical Algorithms Group Ltd., Oxford, UK
Branden Moore, Tim Schmielau & Jacques du Toit

Corresponding author

Correspondence to Istvan Z. Reguly.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Helmholtz-Zentrum Dresden-Rossendorf, Dresden, Sachsen, Germany
Guido Juckeland
Swiss National Supercomputing Centre, Lugano, Ticino, Switzerland
Sadaf Alam
University of Tennessee at Knoxville, Knoxville, TN, USA
Heike Jagode

About this paper

Cite this paper

Reguly, I.Z., Moore, B., Schmielau, T., du Toit, J., Mudalige, G.R. (2019). Batch Solution of Small PDEs with the OPS DSL. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_12

DOI: https://doi.org/10.1007/978-3-030-34356-9_12
Published: 03 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science, Computer Science (R0)
