Event Log Reconstruction Using Autoencoders

  • Conference paper
Service-Oriented Computing – ICSOC 2018 Workshops (ICSOC 2018)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11434)

Abstract

Poor-quality process event logs prevent high-quality business process analysis and improvement. The quality of an event log decreases when attribute values are missing, either because they were never recorded or because incorrect or irrelevant values have been identified and removed. Reconstructing correct values for these missing attributes is likely to increase the quality of event log-based process analyses. Traditional statistical reconstruction methods work poorly with event logs because of the complex interrelations among attributes, events and cases. Machine learning approaches appear more suitable in this context, since they can learn complex models of event logs through training. This paper proposes a method for reconstructing missing attribute values in event logs based on autoencoders. Autoencoders are a class of feed-forward neural networks that reconstruct their own input after having learnt a model of its latent distribution. They suit unsupervised learning problems such as the one considered in this paper: when reconstructing missing attribute values in an event log, one cannot assume that a training set with true labels is available for model training. The proposed method is evaluated on two real event logs against baseline methods commonly used in the literature for imputing missing values in large datasets.
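The idea of reconstructing missing values with an autoencoder can be illustrated with a minimal sketch. It is not the authors' implementation: the toy log, the single tanh hidden layer, the zero-filling of missing inputs, the masked squared-error loss and all function names are assumptions made for illustration, and the code uses only the Python standard library.

```python
import math
import random

def init_params(n_in, n_hidden, seed=0):
    """Small random weights for one encoder and one decoder layer."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    return W1, [0.0] * n_hidden, W2, [0.0] * n_in

def forward(params, x):
    """Encode x into a small hidden vector, then decode it back."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(W2, b2)]
    return h, y

def masked_loss(params, rows, masks):
    """Mean squared reconstruction error over observed entries only."""
    total, count = 0.0, 0.0
    for x, m in zip(rows, masks):
        _, y = forward(params, x)
        total += sum(mi * (yi - xi) ** 2 for mi, yi, xi in zip(m, y, x))
        count += sum(m)
    return total / count

def train(params, rows, masks, epochs=500, lr=0.05):
    """Plain SGD; missing entries contribute no gradient (factor 2 folded into lr)."""
    W1, b1, W2, b2 = params
    for _ in range(epochs):
        for x, m in zip(rows, masks):
            h, y = forward(params, x)
            err = [mi * (yi - xi) for mi, yi, xi in zip(m, y, x)]
            # Back-propagate through the decoder before its weights are updated.
            dh = [(1 - h[j] ** 2) * sum(err[i] * W2[i][j] for i in range(len(x)))
                  for j in range(len(h))]
            for i in range(len(x)):
                for j in range(len(h)):
                    W2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(len(h)):
                for i in range(len(x)):
                    W1[j][i] -= lr * dh[j] * x[i]
                b1[j] -= lr * dh[j]
    return params

def impute(params, raw_rows):
    """Keep observed values; fill each None with the network's reconstruction."""
    filled = []
    for raw in raw_rows:
        x = [0.0 if v is None else v for v in raw]
        _, y = forward(params, x)
        filled.append([yi if v is None else v for v, yi in zip(raw, y)])
    return filled

# Hypothetical toy log: three numeric attributes scaled to [0, 1]; None = missing.
raw_log = [
    [0.2, 0.25, 0.1],
    [0.4, 0.45, None],
    [0.6, None, 0.5],
    [0.8, 0.85, 0.7],
    [None, 0.65, 0.5],
    [0.3, 0.35, 0.2],
]
rows = [[0.0 if v is None else v for v in r] for r in raw_log]
masks = [[0.0 if v is None else 1.0 for v in r] for r in raw_log]

params = init_params(n_in=3, n_hidden=2)
loss_before = masked_loss(params, rows, masks)
train(params, rows, masks)
loss_after = masked_loss(params, rows, masks)
completed = impute(params, raw_log)
```

No labelled training set is needed here: the network is trained only to reproduce the values that are present, which is what makes the setting unsupervised. Real event logs also contain categorical attributes such as activity names, which would need an encoding step that this sketch leaves out.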

This work received funding from NRF Korea Project Number 2017076589.


Notes

  1. at: https://data.4tu.nl/repository/collection:event_logs_real.

  2. https://github.com/pytorch.

  3. https://github.com/hoangnguyen3892/event-log-reconstruction.


Author information

Corresponding author

Correspondence to Marco Comuzzi.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nguyen, H.T.C., Comuzzi, M. (2019). Event Log Reconstruction Using Autoencoders. In: Liu, X., et al. Service-Oriented Computing – ICSOC 2018 Workshops. ICSOC 2018. Lecture Notes in Computer Science, vol 11434. Springer, Cham. https://doi.org/10.1007/978-3-030-17642-6_28

  • DOI: https://doi.org/10.1007/978-3-030-17642-6_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-17641-9

  • Online ISBN: 978-3-030-17642-6

  • eBook Packages: Computer Science, Computer Science (R0)
