Event Log Reconstruction Using Autoencoders

Nguyen, Hoang Thi Cam; Comuzzi, Marco

doi:10.1007/978-3-030-17642-6_28

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11434)

Included in the following conference series:

International Conference on Service-Oriented Computing

Abstract

Poor-quality process event logs prevent high-quality business process analysis and improvement. The quality of an event log decreases when attribute values are missing, or when incorrect or irrelevant attribute values are identified and removed. Reconstructing correct values for these missing attributes is likely to increase the quality of event log-based process analyses. Traditional statistical reconstruction methods work poorly with event logs because of the complex interrelations among attributes, events and cases. Machine learning approaches appear more suitable in this context, since they can learn complex models of event logs through training. This paper proposes a method for reconstructing missing attribute values in event logs using autoencoders. Autoencoders are a class of feed-forward neural networks that reconstruct their own input after having learnt a model of its latent distribution. They suit unsupervised learning problems such as the one considered in this paper: when reconstructing missing attribute values in an event log, one cannot assume that a training set with true labels is available. The proposed method is evaluated on two real event logs against baseline methods commonly used in the literature for imputing missing values in large datasets.

This work received funding from NRF Korea Project Number 2017076589.
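
The reconstruction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture or training setup: the single hidden layer, the tanh activation, the plain gradient descent, the synthetic three-attribute log, and the names `train_autoencoder` and `reconstruct` are all assumptions made for illustration. The network is trained to reproduce only the observed cells of its input, and its output is then used to fill the missing cells.

```python
import numpy as np

def train_autoencoder(X, mask, hidden=4, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer autoencoder on the observed cells of X.

    X    : (n, d) array of numeric event attributes; missing cells may be NaN.
    mask : (n, d) boolean array, True where the cell is observed.
    Returns the learnt parameters and the loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))
    b2 = np.zeros(d)
    X_in = np.where(mask, X, 0.0)  # missing cells are fed as zeros
    losses = []
    for _ in range(epochs):
        H = np.tanh(X_in @ W1 + b1)          # encoder
        Y = H @ W2 + b2                      # decoder (linear output)
        err = np.where(mask, Y - X_in, 0.0)  # ignore missing cells in the loss
        losses.append(0.5 * np.sum(err ** 2) / n)
        # Backpropagation of the masked squared error.
        gW2 = H.T @ err / n
        gb2 = err.sum(axis=0) / n
        dH = (err @ W2.T) * (1.0 - H ** 2)
        gW1 = X_in.T @ dH / n
        gb1 = dH.sum(axis=0) / n
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses

def reconstruct(X, mask, params):
    """Keep observed cells as they are; fill missing cells with the network output."""
    W1, b1, W2, b2 = params
    X_in = np.where(mask, X, 0.0)
    Y = np.tanh(X_in @ W1 + b1) @ W2 + b2
    return np.where(mask, X, Y)

# Hypothetical toy log: three correlated numeric attributes per event, scaled to [0, 1].
rng = np.random.default_rng(1)
duration = rng.uniform(0.0, 1.0, 200)
log = np.column_stack([duration, 0.5 * duration + 0.2, 1.0 - duration])
mask = rng.uniform(size=log.shape) > 0.2   # roughly 20% of cells go missing
log_missing = np.where(mask, log, np.nan)

params, losses = train_autoencoder(log_missing, mask)
repaired = reconstruct(log_missing, mask, params)
```

No true labels are needed here: the training signal comes only from the observed cells, which is what makes the setting unsupervised. Categorical attributes such as activity names would first need an encoding (for example one-hot), which this sketch leaves out.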

Author information

Authors and Affiliations

Trusting Social, Ho Chi Minh City, Vietnam
Hoang Thi Cam Nguyen
Ulsan National Institute of Science and Technology, Ulsan, Republic of Korea
Marco Comuzzi

Corresponding author

Correspondence to Marco Comuzzi.

Editor information

Editors and Affiliations

Deakin University, Melbourne, VIC, Australia
Xiao Liu
University of Pau and Pays, Pau Cedex, France
Michael Mrissa
Fudan University, Shanghai Shi, China
Liang Zhang
LIRIS Lab, University Lyon 1, IUT, Villeurbanne Cedex, France
Djamal Benslimane
School of IT and Computer Science, University of Wollongong, Wollongong, NSW, Australia
Aditya Ghose
Harbin Institute of Technology, Harbin, China
Zhongjie Wang
Scientific and Technological Hub, Fondazione Bruno Kessler (FBK), Trento, Italy
Antonio Bucchiarone
Macquarie University, Sydney, NSW, Australia
Wei Zhang
Queen’s University, Kingston, ON, Canada
Ying Zou
Rochester Institute of Technology, Rochester, NY, USA
Qi Yu

About this paper

Cite this paper

Nguyen, H.T.C., Comuzzi, M. (2019). Event Log Reconstruction Using Autoencoders. In: Liu, X., et al. Service-Oriented Computing – ICSOC 2018 Workshops. ICSOC 2018. Lecture Notes in Computer Science, vol 11434. Springer, Cham. https://doi.org/10.1007/978-3-030-17642-6_28

DOI: https://doi.org/10.1007/978-3-030-17642-6_28
Published: 10 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17641-9
Online ISBN: 978-3-030-17642-6
eBook Packages: Computer Science, Computer Science (R0)
