Copyright © 2003 Elsevier B.V. All rights reserved.
Incorporating memory layout in the modeling of message passing programs*1
Available online 10 July 2003.
Abstract
One of the most fundamental tasks confronting any automatic parallelization and optimization tool is finding an optimal domain decomposition for the application at hand. For regular domain problems (such as simple matrix manipulations) this task may seem trivial. However, communication costs in message passing programs often depend significantly on the capabilities and particular behavior of the communication primitives applied. As a consequence, straightforward domain decompositions may deliver non-optimal performance.
In this paper we introduce a new point-to-point communication model (called P-3PC, or the ‘Parameterized model based on the Three Paths of Communication’) that is specifically designed to overcome this problem. Compared with related models (e.g., LogGP), P-3PC is similar in complexity but more accurate in many situations. Although the model is aimed at MPI’s standard point-to-point operations, it is applicable to similar message passing definitions as well.
The effectiveness of the model is tested in a framework for automatic parallelization of image processing applications. Experiments are performed on two Beowulf-type commodity clusters, each with a different interconnection network and a different MPI implementation. Results show that, where other models frequently fail, P-3PC correctly predicts the communication costs related to any type of domain decomposition.
Author Keywords: MPI; Performance modeling; Automatic domain decomposition
*1 Based on “P-3PC: A Point-to-Point Communication Model for Automatic and Optimal Decomposition of Regular Domain Problems” by F.J. Seinstra and D. Koelma, which appeared in IEEE Transactions on Parallel and Distributed Systems 13 (7):758–768, July 2002. © 2002 IEEE.





