Performance Evaluation
Volume 60, Issues 1-4, May 2005, Pages 73-105
Performance Modeling and Evaluation of High-Performance Parallel and Distributed Systems
 
doi:10.1016/j.peva.2004.10.017    

Copyright © 2004 Elsevier B.V. All rights reserved.

A methodology for detailed performance modeling of reduction computations on SMP machines


Ruoming Jin and Gagan Agrawal (corresponding author)

Department of Computer and Information Sciences, Ohio State University, Columbus, OH 43210, USA


Available online 18 December 2004.

Abstract

In this paper, we revisit the problem of performance prediction on SMP machines, motivated by the need to select a parallelization strategy for random write reductions. Such reductions frequently arise in data mining algorithms.

In our previous work, we developed a number of techniques for parallelizing this class of reductions. That work showed that each of three techniques (full replication, optimized full locking, and cache-sensitive locking) can outperform the others, depending on problem, dataset, and machine parameters. Therefore, an important question is, “Can we predict the performance of these techniques for a given problem, dataset, and machine?”
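To make the trade-off concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a random write reduction is one in which the element of the reduction object that gets updated depends on the input data, illustrated here with a simple histogram. The function names `full_replication` and `full_locking`, the strided work split, and the histogram workload are all hypothetical choices made for illustration. The sketch shows only correctness, not performance: Python threads do not run in parallel the way SMP threads do, and the memory-layout details that distinguish optimized full locking and cache-sensitive locking cannot be expressed here.

```python
import threading


def full_replication(data, num_bins, num_threads):
    """Each thread updates a private copy; the copies are merged at the end."""
    copies = [[0] * num_bins for _ in range(num_threads)]

    def work(tid):
        local = copies[tid]
        # Strided split of the input (an illustrative assumption).
        for x in data[tid::num_threads]:
            local[x % num_bins] += 1  # write location depends on the data

    threads = [threading.Thread(target=work, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Merge step: sum the private copies element by element.
    return [sum(column) for column in zip(*copies)]


def full_locking(data, num_bins, num_threads):
    """One shared copy, with one lock per element of the reduction object."""
    shared = [0] * num_bins
    locks = [threading.Lock() for _ in range(num_bins)]

    def work(tid):
        for x in data[tid::num_threads]:
            i = x % num_bins
            with locks[i]:  # serialize updates to element i only
                shared[i] += 1

    threads = [threading.Thread(target=work, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
    print(full_replication(sample, num_bins=4, num_threads=3))
    print(full_locking(sample, num_bins=4, num_threads=3))
```

Replication avoids locks but pays for extra memory and a merge step that grows with the size of the reduction object, whereas locking keeps a single copy but adds a lock operation to every update. That tension is why the best choice can vary with problem, dataset, and machine parameters.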

This paper addresses this question by developing an analytical performance model that captures a two-level cache, coherence cache misses, TLB misses, locking overheads, and contention for memory. The analytical model is combined with results from micro-benchmarking to predict performance on real machines. We have validated our model on two different SMP machines. Our results show that the model effectively captures the impact of the memory hierarchy (two-level cache and TLB) as well as the factors that limit parallelism (contention for locks, memory contention, and coherence cache misses). The difference between predicted and measured performance is within 20% in almost all cases. Moreover, the model is quite accurate in predicting the relative performance of the three parallelization techniques.

Keywords: Parallel processing; Shared memory; Memory hierarchy; Data mining

Article Outline

1. Introduction
1.1. Random write reductions
1.2. Contributions and organization
2. Problem statement and motivation
2.1. Random write reductions
2.2. Data mining algorithms
2.2.1. A priori association mining
2.2.2. k-means clustering
2.2.3. k-nearest neighbors
2.2.4. Artificial neural networks
3. Parallelization techniques
3.1. Full replication
3.2. Full locking
3.3. Optimized full locking
3.4. Cache-sensitive locking
4. Analytical model
4.1. Cost of waiting
4.2. Cost of cache misses
4.2.1. Capacity and conflict misses
4.2.2. Coherence cache misses
4.3. Cost of TLB misses
4.4. Cost of memory contention
5. Experimental results
5.1. Experimental platforms and experimental design
5.2. Micro-benchmarking
5.3. Results on the large SMP machine
5.4. Results on the small SMP machine
6. Related work
7. Conclusions and future work
References
Vitae