Cambridge Journals Online

Cambridge Journals Online
Skip to content
Journal of Functional Programming (2005), 15: 615-650 Cambridge University Press
Copyright © 2005 Cambridge University Press
doi:10.1017/S0956796804005489
Published online by Cambridge University Press 12 Jul 2005


Article

A parallel SML compiler based on algorithmic skeletons


NORMAN SCAIFE a1, SUSUMI HORIGUCHI a1, GREG MICHAELSON a2 and PAUL BRISTOW a2
a1 School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Nomigun, Ishikawa 923-1292, Japan (e-mail: norman@jaist.ac.jp, hori@jaist.ac.jp)
a2 Department of Computing and Electrical Engineering, Heriot-Watt University, Riccarton, Edinburgh, EH14 4AS, United Kingdom (e-mail: greg@macs.hw.ac.uk, paul@macs.hw.ac.uk)

Article author query
scaife n   [Google Scholar
horiguchi s   [Google Scholar
michaelson g   [Google Scholar
bristow p   [Google Scholar
 

Abstract

Algorithmic skeletons are abstractions from common patterns of parallel activity which offer a high degree of reusability for developers of parallel algorithms. Their close association with higher order functions (HOFs) makes functional languages, with their strong transformational properties, excellent vehicles for skeleton-based parallel program development. However, using HOFs in this way raises substantial problems of identification of useful HOFs within a given application and of resource allocation on target architectures. We present the design and implementation of a parallelising compiler for Standard ML which exploits parallelism in the familiar $map$ and $fold$ HOFs through skeletons for processor farms and processor trees, respectively. The compiler extracts parallelism automatically and is target architecture independant. HOF execution within a functional language can be nested in the sense that one HOF may be passed and evaluated during the execution of another HOF. We are able to exploit this by nesting our parallel skeletons in a processor topology which matches the structure of the Standard ML source. However, where HOF arguments result from partially applied functions, free variable bindings must be identified and communicated through the corresponding skeleton hierarchy to where those arguments are actually applied. We describe the analysis leading from input Standard ML through HOF instantiation and backend compilation to an executable parallel program. We also present an overview of the runtime system and the execution model. Finally, we give parallel performance figures for several example programs, of varying computational loads, on the Linux-based Beowulf, IBM SP/2, Fujitsu AP3000 and Sun StarCat 15000 MIMD parallel machines. These demonstrate good cross-platform consistency of parallel code behaviour.



Cambridge University Press