Abstract
Critical aspects of technical debt often surface only at runtime and are difficult to measure statically. This is a particular challenge for cloud applications because of their highly distributed nature. Mature frameworks for collecting runtime data exist, but they need to be integrated.
In this paper, we report our experience from a project that implements a cloud application running in Kubernetes on Azure. To analyze the runtime data of this software system, we instrumented our services with Zipkin for distributed tracing; with Prometheus and Grafana for analyzing metrics; and with Fluentd, Elasticsearch, and Kibana for collecting, storing, and exploring log files. However, project team members did not use these runtime data until we provided simple, unified access to them through a chat bot.
We argue that collecting runtime data is not sufficient to guarantee that a project team uses it. To be useful, the different data sources need simple, unified access that is integrated into tools team members already commonly use.
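To illustrate what unified access through a chat bot could look like, the following minimal sketch routes one chat command format to three kinds of runtime data. It is not the implementation from the project. The command names, the handler functions, and the service name `checkout` are all hypothetical, and the backends are stubs that return canned text instead of querying Zipkin, Prometheus, or Elasticsearch.

```python
from typing import Callable, Dict


# Hypothetical backend stubs: a real bot would query the tracing, metrics,
# and logging systems here. They return canned strings so the sketch runs
# without any external service.
def fetch_traces(service: str) -> str:
    return f"traces for {service}: (stub) slowest span lookup"


def fetch_metrics(service: str) -> str:
    return f"metrics for {service}: (stub) request rate and error rate"


def fetch_logs(service: str) -> str:
    return f"logs for {service}: (stub) most recent error lines"


# One command table gives team members a single entry point to all sources.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "traces": fetch_traces,
    "metrics": fetch_metrics,
    "logs": fetch_logs,
}


def handle_message(text: str) -> str:
    """Route a chat message of the form '<command> <service>' to a backend."""
    parts = text.strip().split()
    if len(parts) != 2:
        return "usage: <traces|metrics|logs> <service>"
    command, service = parts[0].lower(), parts[1]
    handler = HANDLERS.get(command)
    if handler is None:
        return f"unknown command '{command}'; try: " + ", ".join(sorted(HANDLERS))
    return handler(service)


if __name__ == "__main__":
    # 'checkout' is a made-up service name used only for illustration.
    print(handle_message("metrics checkout"))
```

The single dispatch table is the point of the sketch: adding another data source means adding one entry, and users keep the same command shape regardless of which backend answers.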
Acknowledgments
We thank Robert Hoffmann from Deutsche Telekom for his support.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Lautenschlager, F., Ciolkowski, M. (2018). Making Runtime Data Useful for Incident Diagnosis: An Experience Report. In: Kuhrmann, M., et al. Product-Focused Software Process Improvement. PROFES 2018. Lecture Notes in Computer Science, vol 11271. Springer, Cham. https://doi.org/10.1007/978-3-030-03673-7_33
DOI: https://doi.org/10.1007/978-3-030-03673-7_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03672-0
Online ISBN: 978-3-030-03673-7
eBook Packages: Computer Science, Computer Science (R0)