DOI: 10.1145/1883133.1883138
research-article

Shepherd: node monitors for fault-tolerant distributed process execution in OSIRIS

Published: 01 December 2010

ABSTRACT

OSIRIS is a middleware for the composition and orchestration of distributed web services. It follows a decentralized P2P approach to process execution, which already provides some degree of resilience to faults and high performance in large-scale computational clusters. In this paper, we present ongoing work aimed at improving OSIRIS' fault tolerance capabilities. We introduce new architectural elements in OSIRIS for maintaining a virtual stable storage and monitoring the activities of service instances, together with algorithms that allow execution to survive failures that the system currently cannot cope with.

    • Published in

      WEWST '10: Proceedings of the 5th International Workshop on Enhanced Web Service Technologies
      December 2010
      48 pages
      ISBN: 9781450302388
      DOI: 10.1145/1883133

      Copyright © 2010 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WEWST '10 paper acceptance rate: 5 of 13 submissions, 38%. Overall acceptance rate: 5 of 13 submissions, 38%.
