doi:10.1006/jpdc.1999.1551    

Copyright © 1999 Academic Press. All rights reserved.

Regular Article

Automatically Partitioning Threads for Multithreaded Architectures

Xinan Tang¹ and Guang R. Gao

Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware 19716 [f1]


Received 1 August 1998; revised 15 March 1999; accepted 9 April 1999. Available online 27 March 2002.

Abstract

Fine-grain multithreaded architectures rely on an enormous amount of exposed parallelism to cover latencies. Managing that degree of parallelism by hand is a demanding task for the programmer, so using multithreaded architectures efficiently requires compiler support for automatically partitioning programs into threads. This paper addresses that fundamental problem in compiling for multithreaded architectures: automatically partitioning a program into threads. The goal of such partitioning is to overlap remote communication latency with computation and to minimize total execution time. We first formulate the partitioning problem in terms of a multithreaded execution cost model. We then prove that the problem, so formulated, is NP-hard. We therefore propose two heuristic thread-partitioning methods to solve it in practice. The advanced partitioning algorithm is a novel extension of list scheduling that uses the cost model to generate near-optimal partitions. The remote-path-based partitioning algorithm is a simplified version of the advanced one that is easier to implement in a compiler. The two algorithms were implemented in a thread-partitioning testbed and in a research EARTH-C compiler, respectively. The experimental results show that both partitioning algorithms are effective at generating efficient threaded code, and that the code generated by the compiler is comparable to hand-written code.

Author Keywords: thread partitioning; thread scheduling; multithreaded compilers; multithreaded architectures; parallelizing compilers

1 Corresponding author.

f1 E-mail: tang@eecis.udel.edu, ggao@eecis.udel.edu