Discovering likely invariants of distributed transaction systems for autonomic system management

Jiang, Guofei; Chen, Haifeng; Yoshihira, Kenji

doi:10.1007/s10586-006-0008-1


Published: October 2006

Volume 9, pages 385–399, (2006)

Cluster Computing



Abstract

Large amounts of monitoring data can be collected from distributed systems as observables for analyzing system behavior. However, without reasonable models to characterize these systems, such monitoring data can hardly be interpreted effectively for system management. In this paper, a new concept named flow intensity is introduced to measure the intensity with which internal monitoring data reacts to the volume of user requests in distributed transaction systems. We propose a novel approach to automatically model and search for relationships between the flow intensities measured at various points across the system. If the modeled relationships hold all the time, they are regarded as invariants of the underlying system. Experimental results from a real system demonstrate that such invariants widely exist in distributed transaction systems. Further, we discuss how such invariants can be used to characterize complex systems and support autonomic system management.
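
As a rough illustration of the idea in the abstract, the sketch below fits a relationship between two flow-intensity series and then checks whether that same relationship keeps holding in later observation windows. The abstract does not state the model form, so a plain least-squares line stands in here as an assumption. The function names, the window size, the 0.9 threshold, and the normalized fitness score are all hypothetical choices made for this example, not the paper's definitions.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ a*x + b (assumed model form)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x is constant; slope is undefined")
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return a, my - a * mx


def fitness(x, y, a, b):
    """Normalized fit score; 1.0 is a perfect fit, lower is worse."""
    my = mean(y)
    resid = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    total = sum((yi - my) ** 2 for yi in y)
    if total == 0:
        return 1.0 if resid == 0 else 0.0
    return 1.0 - (resid / total) ** 0.5


def is_likely_invariant(x, y, window=4, threshold=0.9):
    """Fit on the first window, then require the same model to keep
    fitting every non-overlapping window (hypothetical criterion)."""
    a, b = fit_line(x[:window], y[:window])
    for start in range(0, len(x) - window + 1, window):
        xs, ys = x[start:start + window], y[start:start + window]
        if fitness(xs, ys, a, b) < threshold:
            return False
    return True


# Made-up flow intensities, e.g. requests per interval at two points.
x = [10, 20, 30, 40, 15, 25, 35, 45]
y_stable = [2 * xi + 1 for xi in x]                      # relationship holds
y_broken = y_stable[:4] + [5 * xi for xi in x[4:]]       # relationship changes

print(is_likely_invariant(x, y_stable))
print(is_likely_invariant(x, y_broken))
```

In this toy data the first pair keeps the same linear relationship in both windows, while the second pair changes its relationship halfway through, so only the first would be kept as a candidate invariant.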


Author information

Authors and Affiliations

NEC Laboratories America, 4 Independence Way, Princeton, NJ, 08540
Guofei Jiang, Haifeng Chen & Kenji Yoshihira


Corresponding author

Correspondence to Guofei Jiang.

Additional information

Guofei Jiang received the B.S. and Ph.D. degrees in electrical and computer engineering from Beijing Institute of Technology, China, in 1993 and 1998, respectively. During 1998–2000, he was a postdoctoral fellow in computer engineering at Dartmouth College, NH. During 2000–2004, he was a research scientist in the Institute for Security Technology Studies at Dartmouth College. He is currently a research staff member with the Robust and Secure Systems Group at NEC Laboratories America in Princeton, NJ. His current research focuses on distributed systems, dependable and secure computing, and system and information theory. He has published over 50 technical papers in these areas. He is an associate editor of IEEE Security and Privacy magazine and has served on the program committees of many conferences.

Haifeng Chen received the BEng and MEng degrees, both in automation, from Southeast University, China, in 1994 and 1997, respectively, and the PhD degree in computer engineering from Rutgers University, New Jersey, in 2004. He has worked as a researcher in the Chinese national research institute of power automation. He is currently a research staff member at NEC Laboratories America, Princeton, NJ. His research interests include data mining, autonomic computing, pattern recognition, and robust statistics.

Kenji Yoshihira received the B.E. in EE from the University of Tokyo in 1996 and designed processor chips for enterprise computers at Hitachi Ltd. for five years. He then served as CTO at Investoria Inc. in Japan through 2002, developing an Internet service system for financial information distribution, and received the M.S. in CS from New York University in 2004. He is currently a research staff member with the Robust and Secure Systems Group at NEC Laboratories America, Inc. in NJ. His current research focuses on distributed systems and autonomic computing.


About this article

Cite this article

Jiang, G., Chen, H. & Yoshihira, K. Discovering likely invariants of distributed transaction systems for autonomic system management. Cluster Comput 9, 385–399 (2006). https://doi.org/10.1007/s10586-006-0008-1


Received: 20 March 2006
Revised: 25 June 2006
Accepted: 28 June 2006
Issue Date: October 2006
DOI: https://doi.org/10.1007/s10586-006-0008-1
