ABSTRACT
Fault tolerance is an important property of large-scale multi-agent systems, because the failure rate grows with the number of hosts, the number of deployed agents, and the duration of the computation. Several approaches have been introduced to deal with some aspects of the fault-tolerance problem, but most existing solutions are ad hoc. As a result, no existing multi-agent architecture or platform provides a fault-tolerance service that can be used to facilitate the design and implementation of reliable multi-agent systems. We have therefore developed a fault-tolerant multi-agent platform named DimaX. DimaX handles fail-stop failures, such as those caused by bugs or machine breakdowns. It brings fault tolerance to multi-agent applications by using replication techniques, and it is built on a replication framework named DARX.
Index Terms
- DimaX: a fault-tolerant multi-agent platform