Abstract
In this paper we discuss the challenges and optimisation opportunities that arise when solving a large number of small, equally sized discretised PDEs on regular grids. We present an extension of the OPS (Oxford Parallel library for Structured meshes) embedded Domain Specific Language, and show how support can be added for solving multiple systems, and how OPS makes it easy to deploy a variety of transformations and optimisations. The new capabilities in OPS allow data structure transformations, as well as execution schedule transformations, to be applied automatically, delivering high performance on a variety of hardware platforms. We evaluate our work on an industrially representative finance simulation on Intel CPUs as well as NVIDIA GPUs.
Acknowledgements
István Reguly was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. Project no. PD 124905 has been implemented with the support provided from the National Research, Development and Innovation Fund of Hungary, financed under the PD_17 funding scheme. Supported by the ÚNKP-18-4-PPKE-18 new National Excellence Program of the Ministry of Human Capacities.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Reguly, I.Z., Moore, B., Schmielau, T., du Toit, J., Mudalige, G.R. (2019). Batch Solution of Small PDEs with the OPS DSL. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_12
DOI: https://doi.org/10.1007/978-3-030-34356-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science, Computer Science (R0)