A Taxonomy of Anomalies in Distributed Cloud Systems: The CRI-Model

Part of the book series: Studies in Computational Intelligence (SCI, volume 737)

Abstract

Anomaly Detection (AD) in distributed cloud systems is the process of identifying unexpected (i.e. anomalous) behaviour. Many approaches, ranging from machine learning to statistical methods, exist to detect anomalous data instances. However, no generic solutions exist for identifying appropriate metrics for monitoring and for choosing adequate detection approaches. In this paper, we present the CRI-Model (Change, Rupture, Impact), a taxonomy based on a study of anomaly types in the literature and an analysis of system outages in major cloud and web-portal companies. The taxonomy can be used as an analysis tool on identified anomalies to discover gaps in the AD state of a system or to determine the components most often affected by a particular anomaly type. While the dimensions of the taxonomy are fixed, the categories can be adapted to different domains. We show the applicability of the taxonomy to distributed cloud systems using a large dataset of anomaly reports from a software company. The adaptability is further shown for the production automation domain, as a first attempt to generalize the taxonomy to other distributed systems.
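The two uses of the taxonomy named in the abstract (finding gaps in the AD state and finding the components most often affected by an anomaly type) can be illustrated with a minimal sketch. Only the three dimension names (Change, Rupture, Impact) come from the abstract. The record layout, the `detected` flag, both function names, and the category labels such as "c1" or "r2" are hypothetical placeholders, not the chapter's categories or method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AnomalyReport:
    """One classified anomaly report (hypothetical record layout)."""
    component: str   # affected system component
    change: str      # category in the Change dimension (placeholder label)
    rupture: str     # category in the Rupture dimension (placeholder label)
    impact: str      # category in the Impact dimension (placeholder label)
    detected: bool   # assumption: whether existing AD caught the anomaly


def components_by_category(reports, dimension, category):
    """Count how often each component is affected by one category."""
    return Counter(
        r.component for r in reports if getattr(r, dimension) == category
    )


def detection_gaps(reports):
    """Return the (change, rupture, impact) combinations never detected."""
    seen = {}
    for r in reports:
        key = (r.change, r.rupture, r.impact)
        seen[key] = seen.get(key, False) or r.detected
    return sorted(key for key, hit in seen.items() if not hit)


# Placeholder data; the labels carry no meaning beyond this example.
reports = [
    AnomalyReport("db", "c1", "r1", "i1", True),
    AnomalyReport("db", "c1", "r2", "i1", False),
    AnomalyReport("cache", "c1", "r2", "i2", False),
    AnomalyReport("cache", "c2", "r1", "i1", True),
]

print(components_by_category(reports, "change", "c1").most_common(1))
print(detection_gaps(reports))
```

Under these assumptions, the first call ranks components for a single category of one dimension, and the second lists the category combinations for which no report was ever flagged by the existing detection, which is one possible reading of a "gap".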

Author information

Corresponding author

Correspondence to Kim Reichert.

Appendices

Appendix A: Literature Sources Used for the Taxonomy

  • Barford, P. et al.: A signal analysis of network traffic anomalies. In: the second ACM SIGCOMM Workshop, 71–82 (2002)

  • Cheng, H. et al.: Detection and Characterization of Anomalies in Multivariate Time Series. In: Proc. SIAM, 413–424 (2009)

  • Cohen, I. et al.: Correlating Instrumentation Data to System States: A Building Block for Automated Diagnosis and Control. In: Proc. of the 7th symp. on Operating systems design and implementation. 4. (2004)

  • Düllmann, T.: Performance Anomaly Detection in Microservice Architectures Under Continuous Change. Master's thesis. University of Stuttgart, 2017

  • Dunning, T. and Friedman, E.: Practical Machine Learning: A New Look At Anomaly Detection. (2014), p. 65

  • Fu, Q. et al.: Execution anomaly detection in distributed systems through unstructured log analysis. In: Proc. IEEE International Conference on Data Mining, December, 149–158 (2009)

  • Goldstein, M. and Uchida, S.: A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data. In: PLoS ONE 11(4), e0152173, 1–31 (2016)

  • Gu, X. and Wang, H.: Online anomaly prediction for robust cluster systems. In: Proc.—International Conference on Data Engineering, 1000–1011 (2009)

  • Guan, Q. et al.: Efficient and accurate anomaly identification using reduced metric space in utility clouds. In: Proc.—IEEE 7th Int. Conf. on Networking, Architecture and Storage, 207–216 (2012)

  • Gupta, M. et al.: Context-Aware Time Series Anomaly Detection for Complex Systems. In: Proc. of the SDM Workshop on Data Mining for Service and Maintenance, 14–22 (2013)

  • Hole, K.: Anomaly detection with htm. In: Anti-fragile ICT Systems. Springer International Publishing, 2016. Chap. Anomaly de, pp. 125–132

  • Ibidunmoye, O., Hernández-Rodriguez, F., and Elmroth, E.: Performance Anomaly Detection and Bottleneck Identification. In: ACM Computing Surveys 48.1, 1–35 (2015)

  • Munawar, M. et al.: Filtering system metrics for minimal correlation-based self-monitoring. In: IEEE Int. Conf. on Self-Adaptive and Self-Organizing Systems, 233–242 (2009)

  • Sharma, A. et al.: Fault detection and localization in distributed systems using invariant relationships. In: IEEE/IFIP Int. Conf. on Dependable Systems and Networks 1, 1–8 (2013)

  • Sheth, A. et al.: Mojo: A Distributed Physical Layer Anomaly Detection System for 802.11 WLANs. In: Proc. of the 4th int. conf. on Mobile systems, applications and services, 191 (2006)

  • Smith, D., Guan, Q., and Fu, S.: An anomaly detection framework for autonomic management of compute cloud systems. In: Proc.—Int. Computer Software and Applications Conf. 376–381 (2010)

  • Takeishi, N. and Yairi, T.: Anomaly detection from multivariate time-series with sparse representation. In: Proc. IEEE Int. Conf. on Systems, Man and Cybernetics, 2014-January, 2651–2656 (2014)

  • Tan, Y. et al.: PREPARE : Predictive Performance Anomaly Prevention for Virtualized Cloud Systems. In: Distributed Computing Systems (ICDCS). Vcl, pp. 285–294 (2012)

  • Thottan, M. and Ji, C.: Proactive anomaly detection using distributed intelligent agents. In: Network, IEEE October, 21–27 (1998)

  • Wang, T. et al.: Fault Detection for Cloud Computing Systems with Correlation Analysis. In: 652–658 (2015)

Appendix B: Reports of System Outages

Table 3 List of system outage reports from (software) companies

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Reichert, K., Pokahr, A., Hohenberger, T., Haubeck, C., Lamersdorf, W. (2018). A Taxonomy of Anomalies in Distributed Cloud Systems: The CRI-Model. In: Ivanović, M., Bădică, C., Dix, J., Jovanović, Z., Malgeri, M., Savić, M. (eds) Intelligent Distributed Computing XI. IDC 2017. Studies in Computational Intelligence, vol 737. Springer, Cham. https://doi.org/10.1007/978-3-319-66379-1_22

  • DOI: https://doi.org/10.1007/978-3-319-66379-1_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66378-4

  • Online ISBN: 978-3-319-66379-1

  • eBook Packages: Engineering, Engineering (R0)
