Abstract
A multigrain compilation scheme is important for improving the effective performance and usability of shared memory multiprocessor systems. Such a scheme hierarchically exploits coarse grain parallelism among loops, subroutines, and basic blocks; conventional loop parallelism; and near fine grain parallelism among statements inside a basic block. To use the hierarchical parallelism of each nest level, or layer, efficiently in multigrain parallel processing, it is necessary to determine how many processors or groups of processors should be assigned to each layer according to that layer's parallelism. This paper proposes an automatic hierarchical parallelism control scheme that assigns a suitable number of processors to each layer so that the parallelism of each layer can be used efficiently. The performance of the proposed scheme is evaluated on an IBM RS6000 SMP server with 8 processors using 8 programs from SPEC95FP.
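As a rough illustration of the assignment problem described in the abstract, the sketch below splits the processors available to one layer into groups according to an estimate of that layer's parallelism. It is not the scheme proposed in the paper: the parallelism estimate (total task cost divided by critical path length), the divisor-based rounding, and the names `split_processors`, `total_cost`, and `critical_path` are all assumptions made for this example.

```python
import math


def split_processors(total_cost, critical_path, available):
    """Return (groups, processors_per_group) for a single layer.

    Assumption (not from the paper): the layer's parallelism is
    estimated as total_cost / critical_path.
    """
    if total_cost <= 0 or critical_path <= 0 or available < 1:
        raise ValueError("costs must be positive and available >= 1")

    parallelism = total_cost / critical_path
    # Do not create more groups than the estimated parallelism can keep busy.
    wanted = max(1, min(available, math.ceil(parallelism)))
    # Largest divisor of `available` not exceeding `wanted`, so that
    # groups * processors_per_group == available (no processor is left idle).
    groups = max(d for d in range(1, wanted + 1) if available % d == 0)
    return groups, available // groups


# Hypothetical two-layer example on 8 processors.
outer = split_processors(total_cost=100, critical_path=40, available=8)
# Each outer group passes its processors down to the nested layer.
inner = split_processors(total_cost=60, critical_path=10, available=outer[1])
print(outer, inner)
```

In this hypothetical run the outer layer has an estimated parallelism of 2.5, so it is given 2 groups of 4 processors, and each group's 4 processors are then divided again for the nested layer. Rounding down to a divisor keeps the groups the same size, at the cost of sometimes using fewer groups than the estimate would allow.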
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Obata, M., Shirako, J., Kaminaga, H., Ishizaka, K., Kasahara, H. (2005). Hierarchical Parallelism Control for Multigrain Parallel Processing. In: Pugh, B., Tseng, CW. (eds) Languages and Compilers for Parallel Computing. LCPC 2002. Lecture Notes in Computer Science, vol 2481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11596110_3
DOI: https://doi.org/10.1007/11596110_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30781-5
Online ISBN: 978-3-540-31612-1
eBook Packages: Computer Science, Computer Science (R0)