
Bio-inspired algorithm for outliers detection

Multimedia Tools and Applications

Abstract

Outlier detection is an essential activity for obtaining valuable information, for example to identify intrusions, faults, and system failures. This paper proposes a bio-inspired algorithm able to detect anomalous data in distributed systems. Each data object is associated with a mobile agent that follows the well-known bio-inspired flocking algorithm. The agents are randomly disseminated onto a virtual space, where they move autonomously to form one or more flocks. Through a tailored similarity function, agents associated with similar objects join the same flock, whereas agents associated with dissimilar objects do not join any flock. The objects associated with isolated agents, or with agents grouped into a flock whose number of members is below a given threshold, are the outliers. Experimental results on synthetic and real data sets confirm the validity of the approach.
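The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the similarity function or the movement rules, so everything here is an assumption made for illustration. In particular, the function name `flocking_outliers`, the Euclidean threshold `eps` used as similarity test, the unit-square virtual space, and the parameters `sight`, `flock_radius`, and `min_flock_size` are hypothetical. Of the classic flocking rules, only cohesion (restricted to similar agents) is modelled; alignment and separation are left out for brevity.

```python
import math
import random


def similar(a, b, eps):
    """Hypothetical similarity test: Euclidean distance between objects <= eps."""
    return math.dist(a, b) <= eps


def flocking_outliers(data, eps, sight=0.3, flock_radius=0.05,
                      min_flock_size=3, steps=50, seed=0):
    """Return (flocks, outliers) as lists of indices into `data`."""
    rng = random.Random(seed)
    n = len(data)
    # One agent per data object, scattered at random in a unit-square virtual space.
    pos = [(rng.random(), rng.random()) for _ in range(n)]

    for _ in range(steps):
        new_pos = []
        for i in range(n):
            # Agents within sight that carry a similar data object.
            mates = [pos[j] for j in range(n)
                     if j != i
                     and math.dist(pos[i], pos[j]) <= sight
                     and similar(data[i], data[j], eps)]
            if mates:
                # Cohesion: move halfway toward the centroid of similar neighbours.
                cx = sum(p[0] for p in mates) / len(mates)
                cy = sum(p[1] for p in mates) / len(mates)
                x = pos[i][0] + 0.5 * (cx - pos[i][0])
                y = pos[i][1] + 0.5 * (cy - pos[i][1])
            else:
                # No similar neighbour in sight: wander randomly.
                x = pos[i][0] + rng.uniform(-0.05, 0.05)
                y = pos[i][1] + rng.uniform(-0.05, 0.05)
            # Keep the agent inside the virtual space.
            new_pos.append((min(1.0, max(0.0, x)), min(1.0, max(0.0, y))))
        pos = new_pos  # synchronous update of all agents

    # A flock is a connected component of the "close in space AND similar" graph.
    flock_of = [-1] * n
    flocks = []
    for start in range(n):
        if flock_of[start] != -1:
            continue
        members, stack = [], [start]
        flock_of[start] = len(flocks)
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if (flock_of[j] == -1
                        and math.dist(pos[i], pos[j]) <= flock_radius
                        and similar(data[i], data[j], eps)):
                    flock_of[j] = len(flocks)
                    stack.append(j)
        flocks.append(sorted(members))

    # Objects in flocks smaller than the threshold (including lone agents) are outliers.
    outliers = sorted(i for f in flocks if len(f) < min_flock_size for i in f)
    return flocks, outliers
```

As a usage example, calling `flocking_outliers(data, eps=1.0, sight=2.0)` on two tight groups of ten points plus two far-away points reports the two far-away points as outliers. With `sight=2.0` every agent sees the whole unit square, which makes the toy run converge reliably; a smaller `sight` is closer to the local perception of a real flocking model but may need more steps before similar agents meet.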

Author information

Correspondence to Agostino Forestiero.

About this article

Cite this article

Forestiero, A. Bio-inspired algorithm for outliers detection. Multimed Tools Appl 76, 25659–25677 (2017). https://doi.org/10.1007/s11042-017-4443-1

  • DOI: https://doi.org/10.1007/s11042-017-4443-1
