Performance Diagnosis in Cloud Microservices Using Deep Learning

  • Conference paper
Service-Oriented Computing – ICSOC 2020 Workshops (ICSOC 2020)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12632)

Abstract

Microservice architectures are increasingly adopted to design large-scale applications. However, the highly distributed nature and complex dependencies of microservices complicate automatic performance diagnosis and make it challenging to guarantee service level agreements (SLAs). In particular, identifying the culprits of a microservice performance issue is extremely difficult because the set of potential root causes is large and issues can manifest themselves in complex ways. This paper presents an application-agnostic system that locates the culprits of microservice performance degradation at fine granularity: not only the anomalous service from which the performance issue originates, but also the culprit metrics that correlate with the service abnormality. Our method first finds potential culprit services by constructing a service dependency graph, and then applies an autoencoder to identify abnormal service metrics based on a ranked list of reconstruction errors. Our experimental evaluation, based on injecting performance anomalies into a microservice benchmark deployed in the cloud, shows that our system achieves 92% precision in locating the culprit service and 85.5% precision in locating the culprit metrics.
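
The metric-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the metric names, the sample values, and the helpers `fit_mean_reconstructor` and `rank_metrics` are hypothetical. The trained autoencoder is also replaced by a trivial stand-in that "reconstructs" every sample as the per-metric mean of normal data, so only the ranking logic is shown. Any model that maps a metric sample to its reconstruction could be plugged in instead.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]                    # metric name -> normalised value
Reconstructor = Callable[[Sample], Sample]   # sample -> its reconstruction


def fit_mean_reconstructor(normal: List[Sample]) -> Reconstructor:
    """Stand-in for a trained autoencoder (an assumption of this sketch).

    It returns the per-metric mean of the normal training data for every
    input, so deviations from normal behaviour appear as reconstruction error.
    """
    baseline = {name: mean(s[name] for s in normal) for name in normal[0]}
    return lambda sample: dict(baseline)


def rank_metrics(sample: Sample, reconstruct: Reconstructor) -> List[Tuple[str, float]]:
    """Rank metrics by squared reconstruction error, largest first."""
    recon = reconstruct(sample)
    errors = {name: (value - recon[name]) ** 2 for name, value in sample.items()}
    return sorted(errors.items(), key=lambda item: item[1], reverse=True)


# Hypothetical metrics of one candidate service, collected under normal load.
normal = [
    {"cpu": 0.20, "memory": 0.50, "latency": 0.10},
    {"cpu": 0.22, "memory": 0.52, "latency": 0.12},
    {"cpu": 0.18, "memory": 0.48, "latency": 0.08},
]
reconstruct = fit_mean_reconstructor(normal)

# Hypothetical sample taken while a CPU anomaly is injected.
anomalous = {"cpu": 0.95, "memory": 0.55, "latency": 0.30}
ranking = rank_metrics(anomalous, reconstruct)

for name, error in ranking:
    print(f"{name}: {error:.4f}")
```

In this toy setting the metric with the largest reconstruction error is reported first, which matches the idea of a ranked list of culprit metrics. A real autoencoder would additionally capture correlations between metrics, which the mean baseline cannot.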

Notes

  1. Sock-shop - https://microservices-demo.github.io/.
  2. Google Cloud Engine - https://cloud.google.com/compute/.
  3. Istio - https://istio.io/.
  4. Node-exporter - https://github.com/prometheus/node_exporter.
  5. Cadvisor - https://github.com/google/cadvisor.
  6. Prometheus - https://prometheus.io/.
  7. stress-ng - https://kernel.ubuntu.com/~cking/stress-ng/.

Acknowledgment

This work is part of the FogGuru project which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765452. The information and views set out in this publication are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein.

Author information

Corresponding author

Correspondence to Li Wu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, L., Bogatinovski, J., Nedelkoski, S., Tordsson, J., Kao, O. (2021). Performance Diagnosis in Cloud Microservices Using Deep Learning. In: Hacid, H., et al. Service-Oriented Computing – ICSOC 2020 Workshops. ICSOC 2020. Lecture Notes in Computer Science, vol 12632. Springer, Cham. https://doi.org/10.1007/978-3-030-76352-7_13

  • DOI: https://doi.org/10.1007/978-3-030-76352-7_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76351-0

  • Online ISBN: 978-3-030-76352-7

  • eBook Packages: Computer Science (R0)
