
Optimizing Grid-Based Workflow Execution

Journal of Grid Computing, 2005

Abstract

Large-scale applications can be expressed as a set of tasks with data dependencies between them, also known as application workflows. Due to the scale and data processing requirements of these applications, they require Grid computing and storage resources. So far, the focus has been on developing easy-to-use interfaces for composing these workflows and on finding an optimal mapping of tasks in the workflow to the Grid resources in order to minimize the completion time of the application. After this mapping is done, a workflow execution engine is required to run the workflow over the mapped resources. In this paper, we show that the performance of the workflow execution engine in executing the workflow can also be a critical factor in determining the workflow completion time. Using Condor as the workflow execution engine, we examine the various factors that affect the completion time of a fine-granularity astronomy workflow. We show that changing the system parameters that influence these factors and restructuring the workflow can drastically reduce the completion time of this class of workflows. We also examine the effect of the optimizations developed for the astronomy application on a coarser-granularity biology application. We were able to reduce the completion time of the Montage and Tomography application workflows by 90% and 50%, respectively.
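The general idea behind restructuring a fine-granularity workflow is to merge many small tasks into fewer, coarser jobs so that the execution engine's per-job submission and scheduling overhead is paid less often. The Python sketch below illustrates only that general idea under stated assumptions: the level-based grouping, the task names (project_*, combine), and the cluster size are hypothetical and do not reproduce the paper's actual restructuring method or the real Montage workflow.

```python
# A minimal sketch, not the paper's implementation: a workflow is a DAG of
# tasks with data dependencies; a simple pass groups tasks at the same depth
# into fewer, coarser clusters so per-job overhead is amortized.
from collections import defaultdict


def task_levels(parents):
    """Return {task: depth}, where a task with no parents has depth 0."""
    memo = {}

    def depth(task):
        if task not in memo:
            preds = parents.get(task, set())
            memo[task] = 0 if not preds else 1 + max(depth(p) for p in preds)
        return memo[task]

    for task in list(parents):
        depth(task)
    return memo


def cluster_by_level(parents, cluster_size=10):
    """Merge tasks of equal depth into clusters of at most `cluster_size`.

    Tasks in one cluster would run sequentially as a single job, trading a
    little parallelism for far fewer jobs the engine has to dispatch.
    """
    by_level = defaultdict(list)
    for task, level in task_levels(parents).items():
        by_level[level].append(task)

    clusters = []
    for level in sorted(by_level):
        tasks = sorted(by_level[level])
        for i in range(0, len(tasks), cluster_size):
            clusters.append((level, tasks[i:i + cluster_size]))
    return clusters


if __name__ == "__main__":
    # Hypothetical fine-granularity workflow: 100 independent "project" tasks
    # feeding one "combine" task (loosely Montage-like, purely illustrative).
    parents = {f"project_{i:03d}": set() for i in range(100)}
    parents["combine"] = set(parents)

    for level, group in cluster_by_level(parents, cluster_size=25):
        print(f"level {level}: one clustered job containing {len(group)} tasks")
```

Running the sketch groups the 100 hypothetical level-0 tasks into four jobs of 25 plus one final job, which is the kind of reduction in job count that makes per-job scheduling latency far less visible in the overall completion time.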



Author information

Correspondence to Gurmeet Singh.


Cite this article

Singh, G., Kesselman, C. & Deelman, E. Optimizing Grid-Based Workflow Execution. J Grid Computing 3, 201–219 (2005). https://doi.org/10.1007/s10723-005-9011-7
