Parallel Computing
Volume 29, Issue 1, January 2003, Pages 135-159
 
doi:10.1016/S0167-8191(02)00220-X
Copyright © 2002 Elsevier Science B.V. All rights reserved.

Task scheduling using a block dependency DAG for block-oriented sparse Cholesky factorization

Heejo Lee a (corresponding author), Jong Kim b, Sung Je Hong b and Sunggu Lee c

a Ahnlab, Inc., 8F V-Valley Bldg., 724 Suseo-dong, Gangnam-gu, Seoul 135-744, South Korea
b Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang 790-784, South Korea
c Department of Electrical Engineering, Pohang University of Science and Technology, Pohang 790-784, South Korea

Received 11 April 2002; accepted 17 September 2002; available online 19 December 2002.


Abstract

Block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular subblocks; each block can then be handled as a computational unit in order to increase data reuse in a hierarchical memory system. Also, the factorization method increases the degree of concurrency and reduces the overall communication volume so that it performs more efficiently on a distributed-memory multiprocessor system than the customary column-oriented factorization method. But until now, mapping of blocks to processors has been designed for load balance with restricted communication patterns. In this paper, we represent tasks using a block dependency DAG that represents the execution behavior of block sparse Cholesky factorization in a distributed-memory system. Since the characteristics of tasks for block Cholesky factorization are different from those of the conventional parallel task model, we propose a new task scheduling algorithm using a block dependency DAG. The proposed algorithm consists of two stages: early-start clustering, and affined cluster mapping (ACM). The early-start clustering stage is used to cluster tasks while preserving the earliest start time of a task without limiting parallelism. After task clustering, the ACM stage allocates clusters to processors considering both communication cost and load balance. Experimental results on a Myrinet cluster system show that the proposed task scheduling approach outperforms other processor mapping methods.
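The abstract mentions preserving the earliest start time of each task in a dependency DAG. As a rough illustration only, the sketch below computes earliest start times using a common textbook formulation, in which a task may start once every parent has finished and its result has been communicated. It is not the paper's definition or algorithm: the function name, the task names, the cost values, and the assumption that every edge incurs its full communication cost (each task on its own processor) are all hypothetical.

```python
from collections import deque


def earliest_start_times(work, edges):
    """Return the earliest start time of every task in a DAG.

    work  : dict mapping task -> computation cost (hypothetical units)
    edges : dict mapping (parent, child) -> communication cost

    Assumption: every edge pays its communication cost, i.e. no two
    dependent tasks share a processor. This is a generic formulation,
    not the definition used in the paper.
    """
    children = {t: [] for t in work}
    indegree = {t: 0 for t in work}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    est = {t: 0 for t in work}
    ready = deque(t for t in work if indegree[t] == 0)
    visited = 0

    # Process tasks in topological order (Kahn's algorithm).
    while ready:
        parent = ready.popleft()
        visited += 1
        finish = est[parent] + work[parent]
        for child in children[parent]:
            # A child cannot start before this parent's data arrives.
            est[child] = max(est[child], finish + edges[(parent, child)])
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if visited != len(work):
        raise ValueError("task graph contains a cycle")
    return est


# Hypothetical four-task graph, for illustration only.
example_work = {"A": 2, "B": 3, "C": 1, "D": 4}
example_edges = {("A", "B"): 1, ("A", "C"): 2, ("B", "D"): 1, ("C", "D"): 3}
example_est = earliest_start_times(example_work, example_edges)
```

In this formulation, placing a parent and child in the same cluster would set their edge cost to zero, which is one reason clustering can lower a task's start time.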

Author Keywords: Task scheduling; Parallel sparse matrix factorization; Block-oriented Cholesky factorization; Directed acyclic graph

Article Outline

1. Introduction
2. Block-oriented sparse Cholesky factorization
2.1. Block decomposition
2.2. Block Cholesky factorization
2.3. Block operations
2.4. Required number of block update operations
3. Task model with communication costs
3.1. Task characteristics
3.2. Task graph
3.3. Task execution behavior on previous block mapping methods
3.3.1. 2-D cyclic mapping
3.3.2. Load balanced mapping
4. Task scheduling using a block dependency DAG
4.1. Task scheduling parameters
4.1.1. Work and parents of subtasks
4.1.2. Earliest start time of a task
4.1.3. Earliest completion time of a task
4.1.4. Level of a task
4.2. Early-start clustering
4.3. Affined cluster mapping
4.4. Running trace of the proposed scheduling algorithm
5. Performance comparison
6. Conclusion
Acknowledgements
References