A Framework for the Design and Reuse of Grid Workflows

  • Conference paper
Scientific Applications of Grid Computing (SAG 2004)

Abstract

Grid workflows can be seen as special scientific workflows involving high-performance and/or high-throughput computational tasks. Much work on grid workflows has focused on improving application performance through schedulers that optimize the use of computational resources and bandwidth. As high-end computing resources become more of a commodity available to new scientific communities, there is an increasing need to also improve the design and reusability “performance” of scientific workflow systems. To this end, we are developing a framework that supports the design and reuse of grid workflows. Individual workflow components (e.g., for data movement, database querying, job scheduling, and remote execution) are abstracted into a set of generic, reusable tasks. Instantiations of these common tasks can be functionally equivalent atomic components (called actors) or composite components (so-called composite actors or subworkflows). In this way, a grid workflow designer does not have to commit to a particular Grid technology when developing a scientific workflow; instead, different technologies (e.g., GridFTP, SRB, and scp) can be used interchangeably and in concert. We illustrate the application of our framework using two real-world Grid workflows from different scientific domains: cheminformatics and bioinformatics.
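
The idea of a generic task with interchangeable instantiations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the framework's actual design or the Kepler API: the class names (FileTransfer, ScpTransfer, GridFtpTransfer, RelayTransfer) and the function stage_input are hypothetical, locations are treated as opaque strings even though each tool has its own addressing syntax, and the sketch only builds command lines (scp and the GridFTP client globus-url-copy) without running them.

```python
from abc import ABC, abstractmethod
from typing import List

Command = List[str]


class FileTransfer(ABC):
    """Generic, reusable 'data movement' task (hypothetical interface)."""

    @abstractmethod
    def plan(self, source: str, destination: str) -> List[Command]:
        """Return the command lines that would perform the transfer."""


class ScpTransfer(FileTransfer):
    """Atomic actor: one scp invocation."""

    def plan(self, source: str, destination: str) -> List[Command]:
        return [["scp", source, destination]]


class GridFtpTransfer(FileTransfer):
    """Atomic actor: one GridFTP transfer via globus-url-copy."""

    def plan(self, source: str, destination: str) -> List[Command]:
        return [["globus-url-copy", source, destination]]


class RelayTransfer(FileTransfer):
    """Composite actor (subworkflow): two hops through a staging location.

    It satisfies the same interface as the atomic actors, so a workflow
    cannot tell whether it was handed one step or several.
    """

    def __init__(self, first: FileTransfer, second: FileTransfer, staging: str):
        self.first = first
        self.second = second
        self.staging = staging

    def plan(self, source: str, destination: str) -> List[Command]:
        return (self.first.plan(source, self.staging)
                + self.second.plan(self.staging, destination))


def stage_input(transfer: FileTransfer, source: str, destination: str) -> List[Command]:
    """Workflow step written against the generic task only."""
    return transfer.plan(source, destination)


if __name__ == "__main__":
    # The same workflow step runs with either technology, or with both in concert.
    relay = RelayTransfer(GridFtpTransfer(), ScpTransfer(), staging="STAGING")
    for actor in (ScpTransfer(), GridFtpTransfer(), relay):
        print(type(actor).__name__, stage_input(actor, "SOURCE", "DESTINATION"))
```

Because stage_input depends only on the generic task, swapping scp for GridFTP, or replacing a single actor with a composite one, requires no change to the workflow step itself.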

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Altintas, I. et al. (2005). A Framework for the Design and Reuse of Grid Workflows. In: Herrero, P., Pérez, M.S., Robles, V. (eds) Scientific Applications of Grid Computing. SAG 2004. Lecture Notes in Computer Science, vol 3458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11423287_11

  • DOI: https://doi.org/10.1007/11423287_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25810-0

  • Online ISBN: 978-3-540-32010-4

  • eBook Packages: Computer Science (R0)