Abstract
This paper addresses the problem of communication among loosely coupled groups of nodes in distributed systems. We propose a novel logical communication topology based on the skip list data structure, and we enhance this structure to make it more resilient to failures. Extensive simulation experiments demonstrate its good self-stabilization characteristics. We present the concept in the context of our failure detection service, where it is used at the local communication level.
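The abstract does not spell out the construction, so the following is only a loose sketch of the general idea, not the authors' algorithm. It assumes that a skip-list-style ring can be modelled as nested rings: level 0 contains every node, and each higher level keeps a random subset of the level below, as in a probabilistic skip list. The function names, the promotion probability `p`, the level cap `max_level`, and the repair rule in `remove_node` are all hypothetical choices made for illustration.

```python
import random


def build_skip_ring(node_ids, p=0.5, max_level=4, seed=0):
    """Return {level: [ids in ring order]}; level 0 holds every node.

    Assumption: each node on a level is promoted to the next level with
    probability p. The paper may use a different promotion rule.
    """
    rng = random.Random(seed)
    ring = sorted(node_ids)
    levels = {0: ring}
    current = ring
    for level in range(1, max_level + 1):
        promoted = [n for n in current if rng.random() < p]
        if len(promoted) < 2:  # a ring needs at least two members
            break
        levels[level] = promoted
        current = promoted
    return levels


def successors(levels, node):
    """Next node on each ring that `node` belongs to, wrapping around."""
    out = {}
    for level, members in levels.items():
        if node in members:
            i = members.index(node)
            out[level] = members[(i + 1) % len(members)]
    return out


def remove_node(levels, failed):
    """Drop a failed node from every ring so its neighbours close the gap.

    Assumption: repair is a plain splice; higher rings left with fewer
    than two members are discarded.
    """
    result = {}
    for level, members in levels.items():
        kept = [n for n in members if n != failed]
        if level == 0 or len(kept) >= 2:
            result[level] = kept
    return result
```

For example, `successors(build_skip_ring(range(16)), 15)` always maps level 0 to node 0, because the base ring wraps around. Which higher-level shortcuts exist depends on the random promotions. After `remove_node(levels, 3)`, node 2's level-0 successor becomes node 4, which is the kind of local repair a self-stabilizing topology relies on.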
This work was supported by France Telecom under project "Brain" No. 21/06.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kobusiński, J., Gorski, F., Stempin, S. (2008). Skip Ring Topology in FAST Failure Detection Service. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Wasniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2007. Lecture Notes in Computer Science, vol 4967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68111-3_4
DOI: https://doi.org/10.1007/978-3-540-68111-3_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68105-2
Online ISBN: 978-3-540-68111-3
eBook Packages: Computer Science, Computer Science (R0)