Abstract
Grid workflows can be seen as a special class of scientific workflows that involve high-performance and/or high-throughput computational tasks. Much work on grid workflows has focused on improving application performance through schedulers that optimize the use of computational resources and bandwidth. As high-end computing resources become more of a commodity and available to new scientific communities, there is an increasing need to also improve the design and reusability “performance” of scientific workflow systems. To this end, we are developing a framework that supports the design and reuse of grid workflows. Individual workflow components (e.g., for data movement, database querying, job scheduling, and remote execution) are abstracted into a set of generic, reusable tasks. Instantiations of these common tasks can be functionally equivalent atomic components (called actors) or composite components (so-called composite actors or subworkflows). In this way, a grid workflow designer does not have to commit to a particular grid technology when developing a scientific workflow; instead, different technologies (e.g., GridFTP, SRB, and scp) can be used interchangeably and in concert. We illustrate the application of our framework using two real-world grid workflows from different scientific domains, namely cheminformatics and bioinformatics.
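The idea of a generic task with interchangeable instantiations can be illustrated with a minimal Python sketch. This is not the framework's actual API: the names DataTransfer, ScpTransfer, GridFtpTransfer, and StageAndRun are hypothetical, and the sketch only builds command lines (using the standard scp and globus-url-copy source/destination forms) rather than executing anything.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class DataTransfer(ABC):
    """Generic, reusable data-movement task (hypothetical interface)."""

    @abstractmethod
    def command(self, source: str, destination: str) -> List[str]:
        """Return the command line that would move source to destination."""


class ScpTransfer(DataTransfer):
    """Atomic instantiation backed by scp."""

    def command(self, source: str, destination: str) -> List[str]:
        return ["scp", source, destination]


class GridFtpTransfer(DataTransfer):
    """Atomic instantiation backed by GridFTP's globus-url-copy client."""

    def command(self, source: str, destination: str) -> List[str]:
        # globus-url-copy takes a source URL followed by a destination URL.
        return ["globus-url-copy", source, destination]


class StageAndRun:
    """Composite task (hypothetical): stage input data, then run a job.

    The transfer technology is injected, so the composite does not
    commit to any particular one.
    """

    def __init__(self, transfer: DataTransfer) -> None:
        self.transfer = transfer

    def plan(self, source: str, destination: str,
             job: Sequence[str]) -> List[List[str]]:
        # Only plans the steps; nothing is executed in this sketch.
        return [self.transfer.command(source, destination), list(job)]


if __name__ == "__main__":
    job = ["echo", "run-analysis"]
    for transfer in (ScpTransfer(), GridFtpTransfer()):
        print(StageAndRun(transfer).plan("input.dat", "remote:/tmp/input.dat", job))
```

Swapping ScpTransfer for GridFtpTransfer changes only the staging step; the composite and the job step stay the same, which is the kind of interchangeability the abstract describes.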
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Altintas, I. et al. (2005). A Framework for the Design and Reuse of Grid Workflows. In: Herrero, P., Pérez, M.S., Robles, V. (eds) Scientific Applications of Grid Computing. SAG 2004. Lecture Notes in Computer Science, vol 3458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11423287_11
DOI: https://doi.org/10.1007/11423287_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25810-0
Online ISBN: 978-3-540-32010-4
eBook Packages: Computer Science (R0)