Abstract
In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hess, M., Jost, G., Müller, M., Rühle, R. (2003). Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster. In: Voss, M.J. (eds) OpenMP Shared Memory Parallel Programming. WOMPAT 2003. Lecture Notes in Computer Science, vol 2716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45009-2_17
DOI: https://doi.org/10.1007/3-540-45009-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40435-4
Online ISBN: 978-3-540-45009-2
eBook Packages: Springer Book Archive