Automatically learning usage behavior and generating event sequences for black-box testing of reactive systems

Software Quality Journal

Abstract

We propose a novel technique based on recurrent artificial neural networks to generate test cases for black-box testing of reactive systems. We combine functional test inputs that are automatically generated from a model with manually applied test cases for robustness testing, and we use this combination to train a long short-term memory (LSTM) network. As a result, the network learns an implicit representation of the usage behavior that is liable to lead to failures. We use this network to generate new event sequences as test cases. We applied our approach in an industrial case study on the black-box testing of a digital TV system. LSTM-generated test cases revealed several faults, including critical ones, that were not detected by existing automated or manual testing activities. Our approach is complementary to model-based and exploratory testing, and the combined approach outperforms random testing in terms of both fault coverage and execution time.
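The pipeline described in the abstract (pool model-generated and manually applied event sequences, learn a next-event distribution, then sample new sequences as test cases) can be illustrated with a minimal sketch. This is not the authors' implementation: to keep the example free of dependencies, a simple bigram (first-order Markov) counter stands in for the LSTM, and the event names (`POWER`, `MENU`, and so on) and both training sets are hypothetical. Unlike an LSTM, the bigram stand-in conditions only on the previous event, not on a longer history.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # sentinel events marking sequence boundaries


def train_bigram(sequences):
    """Count event-to-next-event transitions over all training sequences.

    Stand-in for LSTM training: it learns P(next event | previous event) only.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        path = [START, *seq, END]
        for prev, nxt in zip(path, path[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_len=20):
    """Sample one event sequence, one event at a time, from the learned counts."""
    seq, current = [], START
    while len(seq) < max_len:
        candidates = counts.get(current)
        if not candidates:  # no observed successor: stop early
            break
        events = list(candidates)
        weights = [candidates[e] for e in events]
        current = rng.choices(events, weights=weights, k=1)[0]
        if current == END:
            break
        seq.append(current)
    return seq


# Hypothetical remote-control event sequences (assumed for illustration).
model_based = [
    ["POWER", "MENU", "OK", "BACK"],
    ["POWER", "CH_UP", "CH_UP", "VOL_UP"],
]
exploratory = [
    ["POWER", "MENU", "MENU", "BACK", "BACK"],
    ["POWER", "CH_UP", "MENU", "CH_UP"],
]

# Combine both sources into one training set, as the abstract describes.
training = model_based + exploratory
counts = train_bigram(training)

rng = random.Random(0)  # fixed seed so repeated runs sample the same sequences
test_cases = [generate(counts, rng) for _ in range(5)]

for case in test_cases:
    print(" ".join(case))
```

Each generated sequence would then be replayed against the system under test. In the approach the abstract describes, the LSTM replaces the bigram table, so the sampled next event can depend on the whole preceding sequence rather than on a single event.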

Notes

  1. http://www.vestel.com.tr

  2. Some tasks require physical access to the system and cannot be automated.

  3. http://www.all4tec.net/

Acknowledgment

We would like to thank the software developers, test engineers, and technicians at Vestel Electronics for sharing their resources with us and supporting our case study. We also thank the anonymous reviewers for their comments on this paper.

Author information

Corresponding author

Correspondence to M. Furkan Kıraç.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kıraç, M.F., Aktemur, B., Sözer, H. et al. Automatically learning usage behavior and generating event sequences for black-box testing of reactive systems. Software Qual J 27, 861–883 (2019). https://doi.org/10.1007/s11219-018-9439-1

  • DOI: https://doi.org/10.1007/s11219-018-9439-1