Abstract
We present a decentralized algorithm for online clustering analysis, used for anomaly detection in self-monitoring distributed systems. In particular, we demonstrate the monitoring of a network of printing devices that performs the analysis without external computing resources (i.e., in-network analysis). We also show how to ensure the robustness of the algorithm, in terms of anomaly detection accuracy, in the face of failures of the network infrastructure on which it runs. Further, we evaluate the overhead required to ensure this robustness and present a method that reduces this overhead while maintaining the algorithm's detection accuracy.
Additional information
The research presented in this paper is supported in part by the National Science Foundation via grant numbers CNS 0305495, CNS 0426354, IIS 0430826 and ANI 0335244, and by the Department of Energy via grant number DE-FG02-06ER54857.
About this article
Cite this article
Quiroz, A., Gnanasambandam, N., Parashar, M. et al. Robust clustering analysis for the management of self-monitoring distributed systems. Cluster Comput 12, 73–85 (2009). https://doi.org/10.1007/s10586-008-0068-5
DOI: https://doi.org/10.1007/s10586-008-0068-5