Off-line simulation inspires insight: A neurodynamics approach to efficient robot task learning

doi:10.1016/j.neunet.2015.09.002

Neural Networks

Volume 72, December 2015, Pages 123-139

https://doi.org/10.1016/j.neunet.2015.09.002

Abstract

There is currently an increasing demand for robots able to acquire the sequential organization of tasks from social learning interactions with ordinary people. Interactive learning-by-demonstration and communication is a promising research topic in current robotics research. However, the efficient acquisition of generalized task representations that allow the robot to adapt to different users and contexts is a major challenge. In this paper, we present a dynamic neural field (DNF) model that is inspired by the hypothesis that the nervous system uses the off-line re-activation of initial memory traces to incrementally incorporate new information into structured knowledge. To achieve this, the model combines fast activation-based learning to robustly represent sequential information from single task demonstrations with slower, weight-based learning during internal simulations to establish longer-term associations between neural populations representing individual subtasks. The efficiency of the learning process is tested in an assembly paradigm in which the humanoid robot ARoS learns to construct a toy vehicle from its parts. User demonstrations with different serial orders together with the correction of initial prediction errors allow the robot to acquire generalized task knowledge about possible serial orders and the longer term dependencies between subgoals in very few social learning interactions. This success is shown in a joint action scenario in which ARoS uses the newly acquired assembly plan to construct the toy together with a human partner.

Introduction

Currently, new generations of robots are being built that are supposed to interact closely with ordinary people in their working and living environments. These robots have to master a wide variety of everyday tasks that cannot be completely designed in advance by experts as in traditional industrial applications (Schaal, 2007). A major challenge of current robotics research is thus to endow robots with an adaptive, efficient and user-friendly instruction method that would allow ordinary people to teach the robot new tasks in an open-ended manner. Ideally, naïve users may bring their own learning experiences in social interactions with other people to the robotics domain. Learning by observing and imitating others' behaviors and their consequences is a powerful social learning mechanism for human-to-human knowledge transfer (Bandura, 1971). It is attractive for the robotics domain as well, since learning by observation significantly speeds up skill acquisition compared to individual discovery in potentially dangerous trial-and-error learning. While a full-blown, human-like social learning capacity for robots still remains a distant goal, major progress has been made over the last decade in various research directions of the programming by demonstration approach (for review papers and collections see, e.g., Billard et al., 2008, Dautenhahn and Nehaniv, 2002). Most robotics experiments thus far have focused on the level of transferring motor skills for object manipulation from human to robot. Some other research started at the more abstract level of learning serial tasks defined by a sequence of subgoals or events in domains like assembly work (Ikeuchi & Suehiro, 1994), navigation (Nicolescu & Matarić, 2003) or household manipulation (Pardowitz, Knoop, Dillmann, & Zöllner, 2007).
Recent developments stress the importance of social learning cues such as verbal feedback and communicative gestures to guide the real-time learning process in an incremental manner (Otero et al., 2008, Thomaz and Breazeal, 2008). However, little attention has been paid thus far to the generality of the acquired task knowledge and to the efficiency of the learning process in terms of the number of demonstrations needed (Pardowitz et al., 2007, Wu and Demiris, 2010). To act as an intelligent and flexible co-worker, it is not sufficient that the robot memorizes a demonstrated task, such as assembling a piece of furniture or laying the dinner table, as a simple linear sequence of events. The robot should be able to adapt the serial order of task execution to the preferences of different users, and at the same time should be able to understand and represent the task structure where the achievement of multiple independent subgoals may enable a final outcome. This in turn requires an ability to connect temporally nonadjacent subtasks. Very importantly, since multiple demonstrations of the same task would be time-consuming and annoying for users, the acquisition of generalized task knowledge in very few demonstrations is crucial for user acceptance.

In this paper we present a neurodynamics approach to robot task learning that takes inspiration from a hypothesis about how the nervous system might efficiently consolidate initial memory traces of sequential events into structured knowledge. Converging lines of evidence from computational theories (e.g., connectionist networks: McClelland, O’Reilly, & McNaughton, 1995; for review see O’Reilly & Norman, 2002) and neurophysiological studies (Euston, Tatsuno, & McNaughton, 2007; for review see Sutherland & McNaughton, 2000) support the notion of two complementary learning systems that allow distributed neural structures to gradually integrate new input patterns without disturbing previously stored information. A fast system is responsible for the rapid storage of newly demonstrated sequential events. The spontaneous re-activation of these memory traces during “off-line states”, in which the neural system is not processing external inputs, facilitates the gradual adjustment of synaptic weights in associative networks of the slow system encoding generalized knowledge of the sequential structure. To model the two complementary learning systems, we apply the theoretical framework of dynamic neural fields (DNFs), which has proven in the past to provide key processing mechanisms for applications in cognitive modeling (Schöner, 2008) and in cognitive robotics (Erlhagen & Bicho, 2006). Most importantly, DNFs explain the existence of self-sustained activity in neural populations as the result of reciprocal positive feedback between neighboring neurons (Amari, 1977). Persistent activity has been reported in many areas of higher association cortices and is commonly believed to support a multitude of relevant cognitive functions such as working memory, decision making, and the learning of associations between events separated in time (Curtis and Lee, 2010, Miller, 2000).
The intrinsically stable dynamics of population activity modeled by DNFs allows us not only to implement the short-term maintenance of task-relevant sequential information but also the active rehearsal of this information during off-line learning periods not constrained by the time course of observed events.
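
To make the notion of self-sustained activity concrete, the following minimal Python sketch integrates a one-dimensional field of the Amari (1977) type, in which each unit receives local excitation and constant inhibition from all currently active units. It is an illustration only: the function name and all parameter values (kernel amplitude and width, inhibition strength, resting level, stimulus shape, step counts) are assumptions chosen for this example and are not taken from the model presented in this paper.

```python
import math


def simulate_field(n_units=61, steps_on=100, steps_off=300, recurrent=True):
    """Euler-integrate tau*du/dt = -u + h + S + sum_j w(i-j)*f(u_j).

    f is a Heaviside step at 0. All numeric values are illustrative
    assumptions. Returns the activation of every unit after the last step.
    """
    tau, dt = 10.0, 1.0                    # time constant and Euler step
    h = -1.0                               # resting level, below threshold
    a_exc, sigma, g_inh = 1.0, 3.0, 0.3    # assumed interaction kernel
    centre = n_units // 2

    def kernel(d):
        # Local Gaussian excitation minus constant inhibition.
        return a_exc * math.exp(-d * d / (2 * sigma ** 2)) - g_inh

    def stimulus(i):
        # Transient localized input centred on the middle unit.
        return 3.0 * math.exp(-(i - centre) ** 2 / (2 * sigma ** 2))

    u = [h] * n_units
    for step in range(steps_on + steps_off):
        # Units above threshold contribute to the lateral interaction.
        active = [j for j in range(n_units) if u[j] > 0.0]
        new_u = []
        for i in range(n_units):
            lateral = sum(kernel(i - j) for j in active) if recurrent else 0.0
            s = stimulus(i) if step < steps_on else 0.0  # input is switched off
            new_u.append(u[i] + dt / tau * (-u[i] + h + s + lateral))
        u = new_u
    return u


if __name__ == "__main__":
    with_feedback = simulate_field(recurrent=True)
    without_feedback = simulate_field(recurrent=False)
    print("active units with lateral interaction:",
          sum(v > 0 for v in with_feedback))
    print("active units without lateral interaction:",
          sum(v > 0 for v in without_feedback))
```

With the lateral interaction enabled, a localized group of units around the stimulated site stays above threshold after the input has been removed; with the interaction disabled, the activation relaxes back to the resting level.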

To test the learning model in real-world robotics experiments, we adopt a construction task that we have used in previous work to test a DNF architecture for natural and fluent human–robot interactions (HRI) (Bicho et al., 2011, Bicho et al., 2010). The main goal of the present study is to acquire in a social learning situation with human tutors the knowledge about possible serial orders of executing the assembly steps. This shared task knowledge was predefined by the designer in our earlier HRI studies. To this end, one or more human tutors first show the humanoid robot ARoS the assembly work consisting of a series of assembly steps necessary to construct a toy object from its parts. The tutor then provides immediate verbal feedback about predicted next steps when the robot tries to reproduce the serial order from memory. Demonstrations with different serial orders together with the integration of prediction errors in the associative learning process allow the robot to acquire generalized task knowledge in very few social learning interactions. This success is shown in a joint action scenario in which ARoS uses the newly acquired assembly plan to construct the toy together with a human partner.

The rest of the paper is structured as follows: Section 2 describes the construction task, the learning paradigm and the robotic platform ARoS. Section 3 gives an overview of the basic processing principles of the dynamic neural field framework. Section 4 describes the DNF-based learning model. Experimental results are presented in Section 5. The paper closes with a discussion of results and future work in Section 6.

Section snippets

Task description

By observing a human tutor, the robot has to learn the sequential structure of individual assembly steps necessary to construct a toy vehicle from its components (Fig. 1). The vehicle has a round platform with an axle as its base (BA). On each side, a wheel (RW and LW) has to be attached to the axle and subsequently fixed with a nut (RN and LN, respectively). Subsequently, 4 columns (GC, BC, RC, MC) identified by their different colors (green, blue, red and magenta, respectively) have to be
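
The ordering constraints stated above (each wheel has to be attached before its nut) can be written as a small precedence check. The Python sketch below is a hypothetical illustration, not part of the paper's model: it uses the part labels from the text, encodes only the two wheel-before-nut constraints, and the names `PRECEDES`, `respects_constraints` and `valid_orders` are assumptions introduced for this example.

```python
from itertools import permutations

# Assumed constraint set, taken only from the wheel/nut description:
# a wheel must be mounted before the nut that fixes it.
PRECEDES = [("RW", "RN"), ("LW", "LN")]


def respects_constraints(order, precedes=PRECEDES):
    """Return True if every (earlier, later) pair appears in that order."""
    position = {step: index for index, step in enumerate(order)}
    return all(position[a] < position[b] for a, b in precedes)


# All serial orders of the four wheel and nut steps that are admissible.
valid_orders = [
    order
    for order in permutations(["RW", "LW", "RN", "LN"])
    if respects_constraints(order)
]

if __name__ == "__main__":
    for order in valid_orders:
        print(" -> ".join(order))
```

For the four wheel and nut steps alone, 6 of the 24 permutations satisfy both constraints, which illustrates why a single memorized linear sequence underdetermines the task structure.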

The Dynamic Neural Field framework

Dynamic neural fields (DNFs) represent a theoretical framework for developing cognitive control architectures that is consistent with fundamental principles of cortical information processing in distributed networks of connected neuronal populations (Erlhagen and Bicho, 2006, Schöner, 2008). Task-relevant information is represented by supra-threshold activity patterns (or bumps) of neural populations. These patterns are initially triggered by a transient input from external sources

DNF model of task learning

Fig. 5 presents a schematic view of the model architecture with two interconnected modules. It reflects the idea of two complementary learning systems that has been proposed as a solution for natural and artificial learning systems to avoid the potentially “catastrophic” interference of old and new memories (McClelland et al., 1995). Converging lines of neurophysiological evidence suggest that Prefrontal Cortex–Hippocampus interactions might constitute a neural substrate for this hypothesis.
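
The division of labor between the two systems can be sketched in a few lines of Python. This is a deliberately simplified, hypothetical illustration and not the DNF implementation: the fast system is reduced to a list of sequences stored after a single demonstration each, and the slow system to a table of successor weights that grow by a small, saturating Hebbian-style increment on every off-line replay. The demonstrated orders, the learning rate, the number of replays and the threshold are assumptions made for the example.

```python
def replay_learning(demonstrations, n_replays=50, rate=0.05):
    """Slowly build successor associations by replaying stored sequences.

    demonstrations: sequences held by the (assumed) fast memory system.
    Returns a dict mapping (predecessor, successor) to a weight in [0, 1).
    """
    weights = {}
    for _ in range(n_replays):
        for sequence in demonstrations:
            for prev_step, next_step in zip(sequence, sequence[1:]):
                w = weights.get((prev_step, next_step), 0.0)
                # Small saturating increment: many replays are needed.
                weights[(prev_step, next_step)] = w + rate * (1.0 - w)
    return weights


def predict_next(weights, current, threshold=0.5):
    """Return all successors of `current` whose weight exceeds the threshold."""
    return sorted(
        nxt for (prev, nxt), w in weights.items()
        if prev == current and w > threshold
    )


if __name__ == "__main__":
    # Two hypothetical demonstrations with different serial orders.
    demos = [
        ["BA", "LW", "LN", "RW", "RN"],
        ["BA", "RW", "RN", "LW", "LN"],
    ]
    print(predict_next(replay_learning(demos, n_replays=1), "BA"))
    print(predict_next(replay_learning(demos), "BA"))
```

After a single replay the associations are still below threshold, so no successor is predicted; after repeated replay both demonstrated successors of the base step are available, without the new sequence overwriting the old one.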

Results

We report results of three learning experiments in which the robot ARoS observed different tutors executing the assembly sequence with varying temporal orders. In the last experiment, an error occurred and was eventually corrected following additional demonstrations. Each experiment consists of observation, off-line rehearsal and execution phases. During observation, the human has all parts within reach to perform the whole sequence. During execution, objects are distributed on the table, and

Discussion and future work

For decades, the idea has been discussed in animal and human research that off-line periods after learning such as awake resting or sleep promote the gradual incorporation of newly acquired information into long-term memory representations. Off-line improvements without further practice have been reported in a wide variety of perceptual, motor and complex cognitive tasks (Stickgold, 2005). The mental replay of memory traces of recent behavioral experiences has been hypothesized to play an

Acknowledgments

The work was funded by FCT - Fundação para a Ciência e Tecnologia, through the PhD Grants SFRH/BD/48529/2008 and SFRH/BD/41179/2007 and Project NETT: Neural Engineering Transformative Technologies, EU-FP7 ITN (nr. 289146) and the FCT-Research Center CMAT (PEst-OE/MAT/UI0013/2014).

References (48)

E. Bicho et al.
Neuro-cognitive mechanisms of decision making in joint action: a human–robot interaction study
Human Movement Science
(2011)

C.E. Curtis et al.
Beyond working memory: The role of persistent activity in decision making
Trends in Cognitive Sciences
(2010)

J.L. Elman
Finding structure in time
Cognitive Science
(1990)

S. Grossberg
Behavioral contrast in short term memory: Serial binary memory models or parallel continuous memory models?
Journal of Mathematical Psychology
(1978)

R.B. Ivry et al.
The neural representation of time
Current Opinion in Neurobiology
(2004)

H. Levesque et al.
Cognitive robotics

R.C. O’Reilly et al.
Hippocampal and neocortical contributions to memory: Advances in the complementary learning systems framework
Trends in Cognitive Sciences
(2002)

A.R. Preston et al.
Interplay of hippocampus and prefrontal cortex in memory
Current Biology
(2013)

B.J. Rhodes et al.
Learning and production of movement sequences: Behavioral, neurophysiological, and modeling perspectives
Human Movement Science
(2004)

Y. Sandamirskaya et al.
An embodied account of serial order: How instabilities drive sequence generation
Neural Networks
(2010)

A.R. Seitz et al.
A common framework for perceptual learning
Current Opinion in Neurobiology
(2007)

G.R. Sutherland et al.
Memory trace reactivation in hippocampal and neocortical neuronal ensembles
Current Opinion in Neurobiology
(2000)

A.L. Thomaz et al.
Teachable robots: Understanding human teaching behavior to build more effective robot learners
Artificial Intelligence
(2008)

S.-i. Amari
Dynamics of pattern formation in lateral inhibition type neural fields
Biological Cybernetics
(1977)

A. Bandura
Social learning theory
(1971)

E. Bicho et al.
Integrating verbal and nonverbal communication in a dynamic neural field architecture for human–robot interaction
Frontiers in Neurorobotics
(2010)

A.G. Billard et al.
Robot programming by demonstration

A. Cleeremans et al.
Learning the structure of event sequences
Journal of Experimental Psychology. General
(1991)

S. Coombes et al.
Exotic dynamics in a firing rate model of neural tissue with threshold accommodation

K. Dautenhahn et al.
Imitation in animals and artifacts
(2002)

W. Erlhagen et al.
The dynamic neural field approach to cognitive robotics
Journal of Neural Engineering
(2006)

W. Erlhagen et al.
Dynamic field theory of movement preparation
Psychological Review
(2002)

D.R. Euston et al.
Fast-forward playback of recent memory sequences in prefrontal cortex during sleep
Science
(2007)

F. Ferreira et al.
Learning a musical sequence by observation: A robotics implementation of a dynamic neural field model


2015 Special Issue
Off-line simulation inspires insight: A neurodynamics approach to efficient robot task learning