The Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations - Volume 1
April 2017
Publisher:
  • Association for Computing Machinery and Morgan & Claypool
ISBN: 978-1-970001-67-9
Published: 24 April 2017
Pages: 662
Appears In:
  • ACM Books
Abstract

The Handbook of Multimodal-Multisensor Interfaces provides the first authoritative resource on what has become the dominant paradigm for new computer interfaces: user input involving new media (speech, multi-touch, gestures, writing) embedded in multimodal-multisensor interfaces. These interfaces support smartphones, wearables, in-vehicle and robotic systems, and many other applications that are now highly competitive commercially.

This edited collection is written by international experts and pioneers in the field. It provides a textbook for students, and a reference and technology roadmap for professionals working in this rapidly emerging area.

Volume 1 of the handbook presents relevant theory and neuroscience foundations for guiding the development of high-performance systems. Additional chapters discuss approaches to user modeling, interface design that supports user choice, synergistic combination of modalities with sensors, and blending of multimodal input and output. The volume also takes an in-depth look at the most common multimodal-multisensor combinations, for example, touch and pen input, haptic and non-speech audio output, and speech co-processed with visible lip movements, gaze, gestures, or pen input. A common theme throughout is support for mobility and individual differences among users, including the world's rapidly growing population of seniors.

These handbook chapters provide walk-through examples and video illustrations of different system designs and their interactive use. Common terms are defined, and information on practical resources is provided (e.g., software tools, data resources) for hands-on project work to develop and evaluate multimodal-multisensor systems. In the final chapter, experts exchange views on a timely and controversial challenge topic: how they believe multimodal-multisensor interfaces should be designed in the future to most effectively advance human performance.

Table of Contents
prefatory
Preface

The content of this handbook would be most appropriate for graduate students, and of primary interest to students studying computer science and information technology, human-computer interfaces, mobile and ubiquitous interfaces, and related ...

PART I: Theory and neuroscience foundations
chapter
Theoretical foundations of multimodal interfaces and systems

This chapter discusses the theoretical foundations of multisensory perception and multimodal communication. It provides a basis for understanding the performance advantages of multimodal interfaces, as well as how to design them to reap these ...

chapter
The impact of multimodal-multisensory learning on human performance and brain activation patterns

The human brain is inherently a multimodal-multisensory dynamic learning system. All information that is processed by the brain must first be encoded through sensory systems and this sensory input can only be attained through motor movement. Although ...

PART II: Approaches to design and user modeling
chapter
Multisensory haptic interactions: understanding the sense and designing for it

Our haptic sense comprises both taction, or cutaneous information obtained through receptors in the skin, and kinesthetic awareness of body forces and motions. Broadly speaking, haptic interfaces to computing systems are anything a user touches or is ...

chapter
A background perspective on touch as a multimodal (and multisensor) construct

This chapter will illustrate, through a series of examples, seven different perspectives of how touch input can be re-framed and re-conceived as a multimodal, multisensor construct.

These perspectives can often benefit particularly from considering the ...

chapter
Understanding and supporting modality choices

One of the characteristic benefits of multimodal-multisensor processing is that it gives users more freedom of choice than they would otherwise have. The most central type of choice concerns the use of input modalities: When performing a particular task ...

chapter
Using cognitive models to understand multimodal processes: the case for speech and gesture production

Multimodal behavior has been studied for a long time and in many fields, e.g., in psychology, linguistics, communication studies, education, and ergonomics. One of the main motivations has been to allow humans to use technical systems intuitively, in a ...

chapter
Multimodal feedback in HCI: haptics, non-speech audio, and their applications

Computer interfaces traditionally depend on visual feedback to provide information to users, with large, high-resolution screens the norm. Other sensory modalities, such as haptics and audio, have great potential to enrich the interaction between user ...

chapter
Multimodal technologies for seniors: challenges and opportunities

This chapter discusses interactive technologies in the service of seniors. Adults over 65 form one of the largest and most rapidly growing user groups in industrialized societies. Interactive technologies have been steadily improving in their ability ...

PART III: Common modality combinations
chapter
Gaze-informed multimodal interaction

Observe a person pointing out and describing something. Where is that person looking? Chances are good that this person also looks at what she is talking about and pointing at. Gaze is naturally coordinated with our speech and hand movements. By ...
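As a rough illustration of the kind of gaze-speech integration this chapter surveys, the sketch below pairs a spoken deictic reference ("that") with the gaze fixation nearest in time and selects the closest on-screen object. All names, timestamps, and the nearest-object rule are illustrative assumptions, not the chapter's algorithm.

```python
# Hypothetical sketch: resolve a spoken deictic reference by pairing it with
# the gaze fixation closest in time, then picking the nearest scene object.
from dataclasses import dataclass
import math

@dataclass
class Fixation:
    t: float   # timestamp in seconds
    x: float   # screen coordinates
    y: float

@dataclass
class SceneObject:
    name: str
    x: float
    y: float

def resolve_deictic(utterance_time: float,
                    fixations: list[Fixation],
                    objects: list[SceneObject]) -> SceneObject:
    """Pick the object nearest the fixation that is closest in time to the utterance."""
    fix = min(fixations, key=lambda f: abs(f.t - utterance_time))
    return min(objects, key=lambda o: math.hypot(o.x - fix.x, o.y - fix.y))

if __name__ == "__main__":
    fixations = [Fixation(0.8, 120, 300), Fixation(1.9, 640, 210)]
    objects = [SceneObject("blue square", 115, 310), SceneObject("red circle", 655, 200)]
    # Speech recognizer reports "move that" at t = 2.0 s
    print(resolve_deictic(2.0, fixations, objects).name)  # -> "red circle"
```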

chapter
Multimodal speech and pen interfaces

This chapter describes interfaces that enable users to combine digital pen and speech input for interacting with computing systems. Such interfaces promise natural and efficient interaction, taking advantage of skills that users have developed over many ...
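The following minimal sketch shows one way such speech-plus-pen input might be integrated: a spoken command is paired with a pen gesture whose timestamps fall within a short temporal window. The event types and the 1.5-second window are assumptions for illustration only, not the chapter's design.

```python
# Hypothetical sketch: fuse a spoken command with a temporally overlapping pen gesture.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeechEvent:
    text: str
    t_start: float
    t_end: float

@dataclass
class PenEvent:
    shape: str                 # e.g. "circle", "arrow"
    region: tuple              # bounding box (x0, y0, x1, y1)
    t_start: float
    t_end: float

def fuse(speech: SpeechEvent, pen_events: list[PenEvent],
         window: float = 1.5) -> Optional[dict]:
    """Return a joint command if a pen gesture overlaps the utterance (within +/- window)."""
    for pen in pen_events:
        if pen.t_start <= speech.t_end + window and pen.t_end >= speech.t_start - window:
            return {"action": speech.text, "shape": pen.shape, "region": pen.region}
    return None

if __name__ == "__main__":
    speech = SpeechEvent("zoom in here", 4.2, 5.0)
    pens = [PenEvent("circle", (220, 180, 360, 290), 4.5, 5.3)]
    print(fuse(speech, pens))  # joint "zoom in here" command on the circled map region
```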

chapter
Multimodal gesture recognition

Starting from the famous "Put That There!" demonstration prototype, developed by the Architecture Machine Group at MIT in the late 1970s, the growing potential of multimodal gesture interfaces in natural human-machine communication setups has stimulated ...

chapter
Audio and visual modality combination in speech processing applications

Chances are that most of us have experienced difficulty in listening to our interlocutor during face-to-face conversation while in highly noisy environments, such as next to heavy traffic or over the background of high-intensity speech babble or loud ...
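One common approach in this setting is weighted late fusion of the audio and visual streams, with the audio weight reduced as acoustic conditions degrade. The sketch below is a minimal, hypothetical example of such a scheme; the SNR-to-weight mapping and the toy posteriors are assumptions, not the chapter's method.

```python
# Hypothetical sketch: SNR-weighted late fusion of audio and visual word posteriors.

def audio_weight(snr_db: float, lo: float = 0.0, hi: float = 20.0) -> float:
    """Map SNR in dB to an audio stream weight in [0, 1] (linear, clipped)."""
    return min(1.0, max(0.0, (snr_db - lo) / (hi - lo)))

def fuse_posteriors(p_audio: dict, p_visual: dict, snr_db: float) -> dict:
    """Weighted-product (log-linear) combination of the two streams, renormalized."""
    w = audio_weight(snr_db)
    scores = {word: (p_audio[word] ** w) * (p_visual[word] ** (1.0 - w))
              for word in p_audio}
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}

if __name__ == "__main__":
    p_audio = {"bat": 0.55, "pat": 0.45}    # noisy audio is uncertain
    p_visual = {"bat": 0.20, "pat": 0.80}   # lip shape favors "pat"
    print(fuse_posteriors(p_audio, p_visual, snr_db=3.0))  # visual stream dominates
```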

PART IV: Multidisciplinary challenge topic: perspectives on learning with multimodal technology
chapter
Perspectives on learning with multimodal technology

To set the stage for this multidisciplinary discussion among experts on the challenging topic of learning with multimodal technology, we ask some basic questions:

• What have neuroscience, cognitive and learning sciences, and human-computer interaction ...

Cited By

  1. Dobiasch M, Oppl S, Stöckl M and Baca A (2023). Pegasos: a framework for the creation of direct mobile coaching feedback systems, Journal on Multimodal User Interfaces, 10.1007/s12193-023-00411-y, 18:1, (1-19), Online publication date: 1-Mar-2024.
  2. Berna Moya J, van Oosterhout A, Marshall M and Martinez Plasencia D (2024). HapticWhirl, a Flywheel-Gimbal Handheld Haptic Controller for Exploring Multimodal Haptic Feedback, Sensors, 10.3390/s24030935, 24:3, (935)
  3. Nazeer M, Salagrama S, Kumar P, Sharma K, Parashar D, Qayyum M and Patil G (2024). Improved Method for Stress Detection Using Bio-Sensor Technology and Machine Learning Algorithms, MethodsX, 10.1016/j.mex.2024.102581, (102581), Online publication date: 1-Jan-2024.
  4. Esteves J and Gonçalves B (2024). Designing Audio-Based Multimodal Interfaces for English Teaching: A Conceptual Model Based on an Integrative Literature Review Advances in Design and Digital Communication IV, 10.1007/978-3-031-47281-7_5, (53-66),
  5. Zhang Xiaojun, Corpas Pastor G and Zhang J (2023). Chapter 7. Videoconference interpreting goes multimodal Interpreting Technologies – Current and Future Trends, 10.1075/ivitra.37.07zha, (169-194)
  6. Crowley J, Coutaz J, Grosinger J, Vazquez-Salceda J, Angulo C, Sanfeliu A, Iocchi L and Cohn A A Hierarchical Framework for Collaborative Artificial Intelligence, IEEE Pervasive Computing, 10.1109/MPRV.2022.3208321, 22:1, (9-18)
  7. Xie C, Liu Y and Zhou H (2023). Exploration of Design Issues from an Embodied Perspective Design, User Experience, and Usability, 10.1007/978-3-031-35699-5_28, (384-395),
  8. Yam-Viramontes B, Cardona-Reyes H, González-Trejo J, Trujillo-Espinoza C and Mercado-Ravell D (2022). Commanding a drone through body poses, improving the user experience, Journal on Multimodal User Interfaces, 10.1007/s12193-022-00396-0, 16:4, (357-369), Online publication date: 1-Dec-2022.
  9. Senaratne H, Oviatt S, Ellis K and Melvin G (2022). A Critical Review of Multimodal-multisensor Analytics for Anxiety Assessment, ACM Transactions on Computing for Healthcare, 10.1145/3556980, 3:4, (1-42), Online publication date: 31-Oct-2022.
  10. Choi Y, Kim J and Hong J Immersion Measurement in Watching Videos Using Eye-tracking Data, IEEE Transactions on Affective Computing, 10.1109/TAFFC.2022.3209311, 13:4, (1759-1770)
  11. Stingl R, Zimmerer C, Fischbach M and Latoschik M (2022). Are You Referring to Me? - Giving Virtual Objects Awareness 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 10.1109/ISMAR-Adjunct57072.2022.00139, 978-1-6654-5365-3, (671-673)
  12. Mangaroska K, Sharma K, Gašević D and Giannakos M (2021). Exploring students' cognitive and affective states during problem solving through multimodal data: Lessons learned from a programming activity, Journal of Computer Assisted Learning, 10.1111/jcal.12590, 38:1, (40-59), Online publication date: 1-Feb-2022.
  13. Oviatt S (2022). Multimodal Interaction, Interfaces, and Analytics Handbook of Human Computer Interaction, 10.1007/978-3-319-27648-9_22-1, (1-29),
  14. Šumak B, Brdnik S and Pušnik M (2021). Sensors and Artificial Intelligence Methods and Algorithms for Human–Computer Intelligent Interaction: A Systematic Mapping Study, Sensors, 10.3390/s22010020, 22:1, (20)
  15. Krüger N, Fischer K, Manoonpong P, Palinko O, Bodenhagen L, Baumann T, Kjærum J, Rano I, Naik L, Juel W, Haarslev F, Ignasov J, Marchetti E, Langedijk R, Kollakidou A, Jeppesen K, Heidtmann C and Dalgaard L (2021). The SMOOTH-Robot: A Modular, Interactive Service Robot, Frontiers in Robotics and AI, 10.3389/frobt.2021.645639, 8
  16. Mangaroska K, Martinez‐Maldonado R, Vesin B and Gašević D (2021). Challenges and opportunities of multimodal data in human learning: The computer science students' perspective, Journal of Computer Assisted Learning, 10.1111/jcal.12542, 37:4, (1030-1047), Online publication date: 1-Aug-2021.
  17. Bhatti O, Barz M and Sonntag D EyeLogin - Calibration-free Authentication Method for Public Displays Using Eye Gaze ACM Symposium on Eye Tracking Research and Applications, (1-7)
  18. Oviatt S, Lin J and Sriramulu A (2021). I Know What You Know: What Hand Movements Reveal about Domain Expertise, ACM Transactions on Interactive Intelligent Systems, 11:1, (1-26), Online publication date: 31-Mar-2021.
  19. Yeamkuan S and Chamnongthai K (2021). 3D Point-of-Intention Determination Using a Multimodal Fusion of Hand Pointing and Eye Gaze for a 3D Display, Sensors, 10.3390/s21041155, 21:4, (1155)
  20. Adam D and Okimoto M (2021). Multimodal Technology: Improving Accessibility of the Design of Home Appliances Advances in Usability, User Experience, Wearable and Assistive Technology, 10.1007/978-3-030-80091-8_53, (452-460),
  21. Biswas R, Barz M and Sonntag D (2020). Towards Explanatory Interactive Image Captioning Using Top-Down and Bottom-Up Features, Beam Search and Re-ranking, KI - Künstliche Intelligenz, 10.1007/s13218-020-00679-2, 34:4, (571-584), Online publication date: 1-Dec-2020.
  22. Chen S and Epps J (2020). Multimodal Coordination Measures to Understand Users and Tasks, ACM Transactions on Computer-Human Interaction, 10.1145/3412365, 27:6, (1-26), Online publication date: 24-Nov-2020.
  23. Zimmerer C, Wolf E, Wolf S, Fischbach M, Lugrin J and Latoschik M (2020). Finally on Par?! Multimodal and Unimodal Interaction for Open Creative Design Tasks in Virtual Reality ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 10.1145/3382507.3418850, 9781450375818, (222-231), Online publication date: 21-Oct-2020.
  24. Davila Delgado J, Oyedele L, Demian P and Beach T (2020). A research agenda for augmented and virtual reality in architecture, engineering and construction, Advanced Engineering Informatics, 10.1016/j.aei.2020.101122, 45, (101122), Online publication date: 1-Aug-2020.
  25. Conati C, Lallé S, Rahman M and Toker D (2020). Comparing and Combining Interaction Data and Eye-tracking Data for the Real-time Prediction of User Cognitive Abilities in Visualization Tasks, ACM Transactions on Interactive Intelligent Systems, 10.1145/3301400, 10:2, (1-41), Online publication date: 30-May-2020.
  26. Sterpu G, Saam C and Harte N How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 10.1109/TASLP.2020.2980436, 28, (1052-1064)
  27. Setiawan D, Priambodo B, Desi Anasanti M, Hazidar A, Naf’an E, Masril M, Handriani I, Kudr Nseaf A and Pratama Putra Z (2019). Designing a Multimodal Graph System to Support Non-Visual Interpretation of Graphical Information, Journal of Physics: Conference Series, 10.1088/1742-6596/1339/1/012059, 1339:1, (012059), Online publication date: 1-Dec-2019.
  28. Meena Y, Cecotti H, Wong-Lin K and Prasad G (2019). Design and evaluation of a time adaptive multimodal virtual keyboard, Journal on Multimodal User Interfaces, 10.1007/s12193-019-00293-z, 13:4, (343-361), Online publication date: 1-Dec-2019.
  29. Biswas R, Mogadala A, Barz M, Sonntag D and Klakow D Automatic Judgement of Neural Network-Generated Image Captions Statistical Language and Speech Processing, (261-272)
  30. Prange A and Sonntag D Modeling Cognitive Status through Automatic Scoring of a Digital Version of the Clock Drawing Test Proceedings of the 27th ACM Conference on User Modeling, Adaptation and Personalization, (70-77)
  31. Alonso V and de la Puente P (2018). System Transparency in Shared Autonomy: A Mini Review, Frontiers in Neurorobotics, 10.3389/fnbot.2018.00083, 12
  32. Crowley J Put That There Proceedings of the 20th ACM International Conference on Multimodal Interaction, (4-4)
  33. Introduction The Handbook of Multimodal-Multisensor Interfaces, (1-16)
  34. Huang H, Gartner G, Krisp J, Raubal M and Van de Weghe N (2018). Location based services: ongoing evolution and research agenda, Journal of Location Based Services, 10.1080/17489725.2018.1508763, 12:2, (63-93), Online publication date: 3-Apr-2018.
  35. Zimmerer C, Fischbach M and Latoschik M (2018). Space Tentacles - Integrating Multimodal Input into a VR Adventure Game 2018 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 10.1109/VR.2018.8446151, 978-1-5386-3365-6, (745-746)
Contributors
  • Monash University
  • Imperial College London
  • Nuance Communications, Inc.
  • German Research Center for Artificial Intelligence (DFKI)
  • Athena - Research and Innovation Center in Information, Communication and Knowledge Technologies
  • German Research Center for Artificial Intelligence (DFKI)
