Abstract
Data stream mining is the process of extracting knowledge from continuous sequences of data. It differs from conventional data mining in that a stream is potentially unbounded, data points arrive online, and each data point can be examined only once. Furthermore, in non-stationary environments the statistical properties of the data can change over time. This paper presents a bio-inspired approach to clustering non-stationary data streams. The proposed algorithm, Ant-Colony Stream Clustering (ACSC), is based on the concept of artificial ants which identify clusters as nests of micro-clusters in dense areas of the data. Micro-clusters are N-dimensional spheres with a maximum radius \(\varepsilon \). In ACSC the \(\varepsilon \)-neighbourhood, crucial in density clustering, is adaptive and does not require expert, data-dependent tuning. The algorithm uses the sliding window model, and summary statistics for each window are stored offline. Experimental results over real and synthetic datasets show that the clustering quality of ACSC is comparable to, or better than, that of leading stream-clustering algorithms while requiring fewer parameters and considerably less computation.
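To make the notion of a radius-bounded micro-cluster concrete, the following is a minimal Python sketch of how the points in one window might be summarised. It is not the ACSC algorithm: there are no ants, \(\varepsilon\) is a fixed argument rather than adaptive, and the radius is assumed to be the root-mean-square distance from the centroid. The names MicroCluster, radius_with, absorb and cluster_window are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Summary of a group of points: count, per-dimension sum and sum of squares."""
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: list = field(default_factory=list)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def _sums_with(self, x):
        # Summary statistics as they would be if point x were included.
        if self.n == 0:
            return 1, list(x), [v * v for v in x]
        ls = [s + v for s, v in zip(self.linear_sum, x)]
        ss = [s + v * v for s, v in zip(self.square_sum, x)]
        return self.n + 1, ls, ss

    def radius_with(self, x):
        """RMS distance from the centroid if x were absorbed (assumed radius definition)."""
        n, ls, ss = self._sums_with(x)
        variance = sum(q / n - (s / n) ** 2 for s, q in zip(ls, ss))
        return math.sqrt(max(variance, 0.0))

    def absorb(self, x):
        self.n, self.linear_sum, self.square_sum = self._sums_with(x)


def cluster_window(points, eps):
    """Single pass over one window: each point joins the nearest micro-cluster
    if the resulting radius stays within eps, otherwise it seeds a new one."""
    micro_clusters = []
    for x in points:
        nearest = min(
            micro_clusters,
            key=lambda m: math.dist(m.centroid(), x),
            default=None,
        )
        if nearest is not None and nearest.radius_with(x) <= eps:
            nearest.absorb(x)
        else:
            mc = MicroCluster()
            mc.absorb(x)
            micro_clusters.append(mc)
    return micro_clusters
```

Each point is read once and only the three summary statistics are kept, which reflects the single-pass constraint described in the abstract. Calling cluster_window(window_points, eps) on successive windows would yield one list of summaries per window.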
Acknowledgments
This work was funded by the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant EP/K001310/1.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Fahy, C., Yang, S. (2017). Dynamic Stream Clustering Using Ants. In: Angelov, P., Gegov, A., Jayne, C., Shen, Q. (eds) Advances in Computational Intelligence Systems. Advances in Intelligent Systems and Computing, vol 513. Springer, Cham. https://doi.org/10.1007/978-3-319-46562-3_32
DOI: https://doi.org/10.1007/978-3-319-46562-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46561-6
Online ISBN: 978-3-319-46562-3
eBook Packages: Engineering, Engineering (R0)