Logo PTI Logo FedCSIS

Proceedings of the 17th Conference on Computer Science and Intelligence Systems

Annals of Computer Science and Information Systems, Volume 30

Automatic code optimization for computing the McCaskill partition functions

, ,

DOI: http://dx.doi.org/10.15439/2022F4

Citation: Proceedings of the 17th Conference on Computer Science and Intelligence Systems, M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (eds). ACSIS, Vol. 30, pages 475478 ()

Full text

Abstract. In this paper, we present the application of three automatic source-to-source compilers to code implementing McCaskill's bioinformatics algorithm. It computes propabilities of various substructures for RNA prediction. McCaskill's algorithm is compute and data intensive and it is within dynamic programming. A corresponding programming code exposes non-uniform dependences that complicates tiling of that code. The corresponding code is represented within the polyhedral model. Its optimization is still a challenging task for optimizing compilers employing multi-threaded loop tiling. To generate optimized code, we used the popular PLuTo compiler that finds and applies affine transformations, the TRACO compiler based on calculating the transitive closure of loop dependence graphs, and the newest polyhedral tool DAPT implementing space-time tiling. An experimental study fulfilled on two multi-core machines: an AMD Epyc with 64 threas and a 2x Intel Xeon Platinum 9242 with 192 threads demonstrates considerable speedup, high locality, and scalability for various problem sizes and the number of threads of generated codes by means of space-time tiling.

References

  1. M. Raden, S. M. Ali, O. S. Alkhnbashi, A. Busch, F. Costa, J. A. Davis, F. Eggenhofer, R. Gelhausen, J. Georg, S. Heyne, M. Hiller, K. Kundu, R. Kleinkauf, S. C. Lott, M. M. Mohamed, A. Mattheis, M. Miladi, A. S. Richter, S. Will, J. Wolff, P. R. Wright, and R. Backofen, “Freiburg RNA tools: a central online resource for RNA-focused research and teaching,” Nucleic Acids Research, vol. 46, no. W1, pp. W25–W29, 2018. http://dx.doi.org/10.1093/nar/gky329
  2. M. Raden, S. M. Ali, O. S. Alkhnbashi, A. Busch, F. Costa, J. A. Davis, F. Eggenhofer, R. Gelhausen, J. Georg, S. Heyne et al., “Freiburg rna tools: a central online resource for rna-focused research and teaching,” Nucleic acids research, vol. 46, no. W1, pp. W25–W29, 2018.
  3. U. Bondhugula, V. Bandishti, and I. Pananilath, “Diamond tiling: Tiling techniques to maximize parallelism for stencil computations,” IEEE Transactions on Parallel and Distributed Systems, vol. 28, no. 5, pp. 1285–1298, May 2017. http://dx.doi.org/10.1109/tpds.2016.2615094
  4. U. Bondhugula, A. Acharya, and A. Cohen, “The pluto+ algorithm: A practical approach for parallelization and locality optimization of affine loop nests,” ACM Trans. Program. Lang. Syst., vol. 38, no. 3, pp. 12:1–12:32, Apr. 2016. http://dx.doi.org/10.1145/2896389
  5. U. Bondhugula et al., “A practical automatic polyhedral parallelizer and locality optimizer,” SIGPLAN Not., vol. 43, no. 6, pp. 101–113, Jun. 2008. http://dx.doi.org/10.1145/1379022.1375595 Http://plutocompiler.sourceforge.net/.
  6. J. M. M. Caamaño, A. Sukumaran-Rajam, A. Baloian, M. Selva, and P. Clauss, “Apollo: Automatic speculative polyhedral loop optimizer,” in IMPACT 2017-7th International Workshop on Polyhedral Compilation Techniques, 2017, p. 8.
  7. S. Verdoolaege, J. Carlos Juega, A. Cohen, J. Ignacio Gómez, C. Tenllado, and F. Catthoor, “Polyhedral parallel code generation for cuda,” ACM Trans. Archit. Code Optim., vol. 9, no. 4, jan 2013. http://dx.doi.org/10.1145/2400682.2400713. [Online]. Available: https: //doi.org/10.1145/2400682.2400713
  8. M. M. Baskaran, A. Hartono, S. Tavarageri, T. Henretty, J. Ramanujam, and P. Sadayappan, “Parameterized tiling revisited,” in Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, ser. CGO ’10. New York, NY, USA: ACM, 2010. ISBN 978-1-60558-635-9 pp. 200–209.
  9. R. Chowdhury, P. Ganapathi, J. J. Tithi, C. Bachmeier, B. C. Kuszmaul, C. E. Leiserson, A. Solar-Lezama, and Y. Tang, “Autogen: Automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs,” ACM SIGPLAN Notices, vol. 51, no. 8, pp. 1–12, 2016.
  10. U. Bondhugula et al., “A practical automatic polyhedral parallelizer and locality optimizer,” SIGPLAN Not., vol. 43, no. 6, pp. 101–113, Jun. 2008. [Online]. Available: http://pluto-compiler.sourceforge.net
  11. W. Pugh and D. Wonnacott, “An exact method for analysis of value-based array data dependences,” in Sixth Annual Workshop on Programming Languages and Compilers for Parallel Computing. Springer-Verlag, 1993.
  12. W.Bielecki and M. Palkowski, “Tiling of arbitrarily nested loops by means of the transitive closure of dependence graphs,” International Journal of Applied Mathematics and Computer Science (AMCS), vol. Vol. 26, no. 4, pp. 919–939, December 2016. http://dx.doi.org/10.1515/amcs-2016-0065
  13. W. Bielecki and M. Palkowski, “Space-time loop tiling for dynamic programming codes,” Electronics, vol. 10, no. 18, p. 2233, 2021.
  14. M. Palkowski and W. Bielecki, “Parallel cache-efficient code for computing the McCaskill partition functions,” vol. 18, pp. 207–210, 2019. http://dx.doi.org/10.15439/2019F8
  15. W. Bielecki and M. Poliwoda, “Automatic parallel tiled code generation based on dependence approximation,” in International Conference on Parallel Computing Technologies. Springer, 2021, pp. 260–275.