
Teaching a pet-robot to understand user feedback through interactive virtual training tasks

Published in: Autonomous Agents and Multi-Agent Systems

Abstract

In this paper, we present a human-robot teaching framework that uses “virtual” games as a means of adapting a robot to its user through natural interaction in a controlled environment. We present an experimental study in which participants instruct an AIBO pet robot while playing different games together on a computer-generated playfield. By playing the games and receiving instruction and feedback from its user, the robot learns to understand the user’s typical way of giving multimodal positive and negative feedback. The games are designed so that the robot can reliably predict positive or negative feedback based on the game state, and can explore its user’s reward behavior by making good or bad moves. We implemented a two-stage learning method that combines Hidden Markov Models with a mathematical model of classical conditioning to learn to discriminate between positive and negative feedback. The system combines multimodal speech and touch input for reliable recognition. After training, the system recognized positive and negative reward with an average accuracy of 90.33%.
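The second learning stage mentioned in the abstract, a mathematical model of classical conditioning, can be sketched with the Rescorla-Wagner update rule, a standard model in this literature. The sketch below is illustrative only: the stimulus names, learning rates, and reward values are assumptions, not the paper's actual parameters or implementation.

```python
# Illustrative sketch (not the paper's implementation): associating
# user-feedback stimuli (speech/touch events) with reward using the
# Rescorla-Wagner rule. V[s] is the associative strength of stimulus s;
# lam is the outcome of the trial (+1 positive feedback, -1 negative).

def rescorla_wagner_update(V, present_stimuli, lam, alpha=0.3, beta=1.0):
    """One conditioning trial: update strengths of all co-present stimuli."""
    total = sum(V[s] for s in present_stimuli)  # combined prediction
    error = lam - total                         # prediction error
    for s in present_stimuli:
        V[s] += alpha * beta * error            # all present stimuli share the error
    return V

# Toy example: the word "good" (speech) paired with a pat (touch) predicts
# positive reward; the word "no" predicts negative reward.
V = {"good": 0.0, "pat": 0.0, "no": 0.0}
for _ in range(20):
    V = rescorla_wagner_update(V, ["good", "pat"], lam=+1.0)
    V = rescorla_wagner_update(V, ["no"], lam=-1.0)

print(V["good"] > 0 and V["pat"] > 0 and V["no"] < 0)  # True
```

Because co-present stimuli share the prediction error, "good" and "pat" each converge toward half the reward, which captures the blocking and overshadowing effects that distinguish Rescorla-Wagner from naive per-stimulus counting.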



Author information

Correspondence to Anja Austermann.


Cite this article

Austermann, A., Yamada, S. Teaching a pet-robot to understand user feedback through interactive virtual training tasks. Auton Agent Multi-Agent Syst 20, 85–104 (2010). https://doi.org/10.1007/s10458-009-9095-8
