Multimodal Object-Based Environment Representation for Assistive Robotics

International Journal of Social Robotics

Abstract

Autonomous robots are now used successfully in industrial environments, where tasks follow predetermined plans and the world is a known (and closed) set of objects. Social robotics brings new challenges. First, the world is no longer closed: new objects can be introduced at any time, so it is impossible to build an exhaustive list of them or to rely on a precomputed set of descriptors. Moreover, natural interactions with a human being do not follow any precomputed graph of sequences or grammar. To deal with the complexity of such an open world, a robot can no longer rely solely on its sensor data: it needs a compact representation to comprehend its surroundings. Our approach focuses on task-independent environment representation where human-robot interactions are involved. We propose a global architecture bridging the gap between perception and semantic modalities through instances (physical realizations of semantic concepts). In this article, we describe a method for the automatic generation of an object-related ontology. Based on it, we discuss a practical formalization of the ill-defined notion of “context”. We then address human-robot interactions in our system by describing how user requests are processed. Finally, we illustrate the flow of our model on two showcases that demonstrate the validity of the approach.

Author information

Corresponding author

Correspondence to Yohan Breux.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Breux, Y., Druon, S. Multimodal Object-Based Environment Representation for Assistive Robotics. Int J of Soc Robotics 12, 807–826 (2020). https://doi.org/10.1007/s12369-019-00600-4

  • DOI: https://doi.org/10.1007/s12369-019-00600-4
