1 Introduction

Patients who have suffered impairment of their neuromotor abilities due to a disease or accident have to relearn to control their bodies. For example, after stroke the ability to coordinate the movements of the upper limb in order to reach and grasp an object could be severely damaged. In the case of amputees, the functional ability is completely lost. Early rehabilitation interventions aim to help patients reduce the impairment’s impact on their lives, and to help them recover in a way that allows them to regain some ability and independence during activities of daily living. It is widely recognized that a rehabilitation intervention should be well-guided, well-focused, and repetitive. This is, in a way, the same kind of strategy used when learning a new skill, such as playing an instrument or a sport.

This is why it is important for people working in rehabilitation to study and understand the mechanisms of human motor control and learning. This knowledge has helped to shape current rehabilitation methods, as well as to develop technologies and machines that assist impaired people to improve their quality of life and integrate faster into society.

Understanding motor control and learning even for a simple movement (e.g., reaching for a glass) is a major endeavor due to the many variables that come into play. David Marr [80] identified three levels of abstraction that could be used for the study of motor control [97]. Certainly, these levels are not independent of each other, but they can be used to organize the vast amount of detail that has to be taken into account in motor control and learning.

The first level is the computational theory level, which relates to an abstract description of what the system is supposed to achieve and why, and that results in operations defined only by the constraints that need to be satisfied. Therefore, this level could be regarded as a mathematical formulation of the movement plan, and it should take into account the different variables, restrictions, difficulties, and outcomes that would arise when the movement has started. The second level is the algorithmic level, which describes the behavioral and cognitive states that are used in real time during the movement. In other words, it specifies how the first level would be accomplished. There are often several possible algorithms that can be used to achieve a desired movement, and the choice will depend on the characteristics of each algorithm. For example, a person could decide to grasp a cup from different angles (e.g., from the top or from the side) depending on the initial posture or the state of the cup (e.g., full or empty). The algorithm could even change once the movement has started due to novel states (e.g., external perturbation or obstacles). The final level is the implementation level, which describes how algorithms are physically implemented (e.g., by contracting muscles or using a prosthetic hand).

In this chapter, we present a general overview of the computational and algorithmic levels by discussing the most relevant motor control and learning theories that have been put forward in recent years. Furthermore, we discuss how these theories are currently being applied to rehabilitation technologies.

The chapter is divided into two parts. The first provides a general overview of relevant and recent theories of motor control and learning. In the second part of the chapter, we describe two practical applications that make use of these theories in order to improve the control and development of neuroprostheses and hand prostheses for rehabilitation.

2 Theories of Motor Control

Imagine that on a sunny day you are thirsty, and you decide to drink a glass of water. Although you are able to perform this action relatively easily, a variety of challenges have to be tackled in order to accomplish the required task. Which muscle contractions will allow you to reach the glass, grasp it and bring it to the mouth, eventually satisfying your thirst? How does your central nervous system (CNS) compute this solution? These and similar questions have aroused scientists’ curiosity for centuries, and several theories have been proposed to explain the control of coordinated movements. This section provides a general overview of these works.

There are several issues that make the generation of movements a very difficult problem. First of all, the musculoskeletal system is inherently redundant. Each point in space can be reached with many different joint configurations. Similarly, joint torques can be obtained by an infinity of muscle forces, which in turn can be generated by several muscle activation patterns. Redundancy allows us to perform motor tasks flexibly and robustly; however, it raises the question of how motor commands are selected. The question of how the CNS “chooses” among all the possible solutions to a motor task is a long-standing riddle in motor neuroscience, referred to as the “redundancy problem” or “Bernstein problem” [11]. Second, our sensing and motor systems are corrupted by noise [30]. This feature, along with the unpredictability of the environment, adds uncertainty to our perception of the world and to the result of our actions. Moreover, neural pathways introduce delays. Hence, sensory information carries past information, and motor commands will be executed in the future. How does the CNS account for these delays in order to, for example, react to a sudden change of the world? Finally, the nonlinearities of the neuromusculoskeletal system have to be taken into account for effective motor planning and execution.
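
As a minimal illustration of kinematic redundancy, the sketch below uses a planar two-link arm, which is an assumption made here for simplicity and not a model taken from the cited works. The link lengths, the target, and the function names are hypothetical. Even this reduced arm reaches the same hand position with two distinct joint configurations; with more joints, as in the real limb, the set of solutions becomes a continuum.

```python
import math

def forward_kinematics(q1, q2, l1=0.3, l2=0.3):
    """Hand position of a planar two-link arm (shoulder angle q1, elbow angle q2)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def inverse_kinematics(x, y, l1=0.3, l2=0.3):
    """Return both joint configurations (elbow bent either way) that reach (x, y)."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = max(-1.0, min(1.0, c2))  # guard against rounding just outside [-1, 1]
    solutions = []
    for q2 in (math.acos(c2), -math.acos(c2)):
        q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
        solutions.append((q1, q2))
    return solutions

# Hypothetical target within reach of the arm (lengths in meters).
target = (0.35, 0.25)
configs = inverse_kinematics(*target)
reached = [forward_kinematics(q1, q2) for q1, q2 in configs]
```

Both entries of `reached` coincide with `target`, although the two entries of `configs` differ in the sign of the elbow angle.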

2.1 Optimal Control

Theoretically, redundancy makes it possible to perform the same action in very different ways. Yet experimental observations have shown that individuals seem to employ the same strategy to solve a given task, i.e., movement features are shared across subjects. As an example, during simple point-to-point reaching movements hand trajectories appear consistently straight and characterized by bell-shaped velocity profiles, independently of movement direction and amplitude [84]. The fundamental principle underlying this phenomenon is unknown; however, a largely accepted idea in the scientific community is that movements are selected because they optimize certain aspects of behavior [115]. This view allows scientists to explain similarity across subjects in terms of fundamental principles, but it poses the challenge of identifying the behavioral aspect that is actually optimized. The main criticism of this approach is indeed that it might always be possible to find an optimization criterion that explains the behavior at hand.

To interpret motor skills in terms of optimization principles, it is necessary to model the body, and to propose a cost function to be minimized. The model represents the evolution of the body variables (e.g., joint angles, end-effector position, muscle kinematics) as a function of the state and the motor commands (e.g., joint torques, muscle force, muscle activation); the cost function formalizes the behavioral aspect that is hypothetically minimized. The idea is to find the control policy (i.e., a mapping from time or state of the body to motor commands) that leads to a successful completion of a desired motor task, and that minimizes the cost function. If simulations of the model under the computed control policy approximate the movements experimentally observed, then it is suggested that the CNS selects such movements because they minimize the proposed cost.
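
In generic notation, the problem described above can be sketched as follows. All symbols are assumptions introduced here for illustration: $x$ is the body state, $u$ the motor command, $f$ the body model, $\ell$ a running cost, $\phi$ a terminal cost, and $\mathcal{G}$ the set of states in which the task is completed. The specific models discussed in this section differ mainly in the choice of $f$, $\ell$, and $\phi$.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), u(t)\bigr), \qquad x(0) = x_0, \\
J(u) &= \phi\bigl(x(T)\bigr) + \int_0^T \ell\bigl(x(t), u(t)\bigr)\,\mathrm{d}t, \\
u^{*} &= \arg\min_{u} J(u) \quad \text{subject to } x(T) \in \mathcal{G}.
\end{aligned}
```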

A variety of cost functions have been proposed as models of the movement selection process. Initially, to explain the kinematic regularities observed by Morasso [84], Flash and Hogan [38] theorized the so-called minimum jerk model, where the square of the third derivative of the end-effector position (i.e., jerk) is minimized, obtaining straight trajectories and bell-shaped velocity profiles. Later, scientists started to focus on dynamical aspects of the motor system. Uno et al. [119] proposed that the rate of change (with respect to time) of joint torque is minimized instead of the third derivative of end-effector position. Since these variables are related by a nonlinear mapping, these cost functions yield different solutions. More recently, different research groups formalized a model based on minimizing the squared motor commands [25, 60, 116]. This measure does not reflect energy consumption; instead, it should be viewed as an abstract notion of “effort” [50]. One of the criticisms of all these models has been that they do not include the inherent noise of the sensory-motor systems. A fundamental characteristic of motor noise is that its standard deviation scales with the amplitude of motor commands, i.e., signal-dependent noise [62]. To take this observation into account, Harris and Wolpert [53] proposed that the process of motor planning minimizes endpoint variance, hence maximizing movement accuracy. The minimum endpoint variance model was able to predict eye and arm movements [53, 54]; however, it was less accurate than the minimum effort model in predicting the distribution of forces generated by each finger to produce a total desired force goal [89].
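
To make the minimum jerk prediction concrete, the sketch below evaluates the well-known fifth-order polynomial that minimizes integrated squared jerk for a one-dimensional point-to-point movement starting and ending at rest. The function name, the movement amplitude, and the duration are hypothetical choices made for illustration.

```python
def minimum_jerk(x0, xf, duration, n_samples=101):
    """Position and velocity of a 1-D minimum-jerk movement from x0 to xf."""
    positions, velocities = [], []
    for i in range(n_samples):
        tau = i / (n_samples - 1)  # normalized time in [0, 1]
        s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5                 # normalized position
        ds = (30 * tau**2 - 60 * tau**3 + 30 * tau**4) / duration  # its time derivative
        positions.append(x0 + (xf - x0) * s)
        velocities.append((xf - x0) * ds)
    return positions, velocities

# Hypothetical 20 cm reach lasting 0.5 s.
pos, vel = minimum_jerk(x0=0.0, xf=0.2, duration=0.5)
peak_index = vel.index(max(vel))
```

The velocity profile is zero at both ends, symmetric, and peaks at mid-movement with a value of 1.875 times the average velocity, which is the bell shape referred to above.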

2.2 Optimal Feedback Control

The optimality models described so far provide control policies that do not react to execution errors. In mathematical terms, they are functions of time only (i.e., feedforward). On the other hand, we are able to adjust our movements at the occurrence of unexpected events. One possibility to address this limitation could be to introduce a fast feedback loop that, upon disturbances, tries to push the state of the system to a previously planned desired trajectory. However, this strategy is very inflexible (as it assumes a single strategy to solve a motor task, i.e., the planned trajectory), and it might lead to suboptimal solutions (for example, it could increase the effort to solve a task). A less trivial possibility is provided by the framework of optimal feedback control [12, 103, 116]. An optimal feedback control law is a policy that minimizes a given cost function, and specifies the optimal motor command for each state of the body and time of execution. In the field of motor neuroscience, many researchers have proposed a cost function that takes into account task accuracy and effort [116]. As a result, the controller reacts to perturbations that are task-relevant and ignores deviations of task-unrelated variables, as opposed to following a preplanned trajectory. This strategy is also referred to as the minimum intervention principle [75]. These predictions are confirmed by several experimental observations [25, 40, 49]. However, optimal control theories have recently been challenged by novel results that suggest a habitual rather than optimal execution of motor tasks [24, 77].
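
The contrast between trajectory tracking and minimum intervention can be sketched with a toy task in which two effectors must jointly produce a target sum. This is a static, single-step simplification introduced here for illustration and not a full optimal feedback controller; the function names and numbers are hypothetical.

```python
def minimal_intervention(x1, x2, target_sum):
    """Least-effort correction that restores x1 + x2 == target_sum."""
    task_error = target_sum - (x1 + x2)
    return task_error / 2.0, task_error / 2.0  # share the correction equally

def trajectory_tracking(x1, x2, target_sum):
    """Push each effector back to its own planned value (target_sum / 2)."""
    return target_sum / 2.0 - x1, target_sum / 2.0 - x2

def effort(u1, u2):
    """Squared motor command, used here as an abstract measure of effort."""
    return u1**2 + u2**2

target = 10.0
irrelevant = (6.0, 4.0)  # perturbed state whose sum still equals the target
relevant = (6.0, 5.0)    # perturbed state whose sum misses the target

effort_mi_irrelevant = effort(*minimal_intervention(*irrelevant, target))
effort_tt_irrelevant = effort(*trajectory_tracking(*irrelevant, target))
effort_mi_relevant = effort(*minimal_intervention(*relevant, target))
effort_tt_relevant = effort(*trajectory_tracking(*relevant, target))
```

Both controllers restore the task variable, but the minimum-intervention rule spends no effort on the task-irrelevant deviation (0.0 versus 2.0) and less effort on the task-relevant one (0.5 versus 1.0).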

2.3 Internal Models

One of the assumptions of optimal feedback control models is that precise sensory information is instantly available. This assumption is unrealistic, because noise as well as delays affect the sensory-motor system. To overcome this issue, it has been proposed that the CNS computes online estimates of the current sensory information by taking into account previous motor commands and (out-of-date) sensory readings. This is hypothetically achieved by means of two mechanisms: forward models and sensory integration.

Forward models are computational entities that instantiate models of the neuromuscular system and the environment, and predict the sensory consequences of motor commands. The brain could then take decisions based on such predictions (hence used as fast feedback loops) without having to wait for the actual delayed sensory readings. The idea that the CNS employs such a mechanism was initially proposed to explain how the brain corrects movements that are executed so quickly that sensory feedback cannot be used, e.g., saccadic eye movements [106]. Recently, Xu-Wilson et al. [126] have shown that, unlike healthy subjects, patients with cerebellar damage cannot compensate for the variability of motor commands in saccades. Since there is evidence that forward models are implemented in the cerebellum [90], these results suggest that healthy people employ forward models to predict the consequences of saccadic motor commands and readily correct predicted errors.
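
The role of a forward model in compensating for sensory delays can be sketched as follows. The plant is an assumed noise-free toy system in which each motor command simply displaces the position; the class name, the delay, and the command sequence are hypothetical. The estimator replays stored copies of recent commands on top of an out-of-date sensory reading.

```python
from collections import deque

class ForwardModelEstimator:
    """Estimate the current position from a delayed reading plus stored commands.

    Assumed toy plant: position[t + 1] = position[t] + command[t].
    """

    def __init__(self, delay_steps):
        # Only the commands issued during the sensory delay need to be kept.
        self.recent_commands = deque(maxlen=delay_steps)

    def store_command(self, command):
        self.recent_commands.append(command)

    def estimate(self, delayed_position):
        # Replay the commands issued since the delayed reading was taken.
        return delayed_position + sum(self.recent_commands)

commands = [0.1, 0.2, 0.3, 0.2, 0.1, 0.0]
delay = 3
positions = [0.0]
estimator = ForwardModelEstimator(delay)
for u in commands:
    positions.append(positions[-1] + u)
    estimator.store_command(u)

delayed_reading = positions[-1 - delay]  # what the sensors report now
estimate = estimator.estimate(delayed_reading)
```

In this idealized setting the estimate matches the true current position, whereas the raw delayed reading lags behind it. With noise or an imperfect model, the prediction would have to be combined with sensory information rather than trusted on its own.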

Internal models have also been investigated in movements affected by long-latency sensory feedback. To this end, the classical experimental paradigm consisted in applying force disturbances by means of a robotic device [105]. The authors of these studies observed that after an adaptation period, subjects were able to compensate for the applied disturbance, and concluded that internal models were continuously updated in order to predict the sensory consequences of the motor commands in the altered environment (see Sect. 3). Flanagan et al. [36] arrived at similar conclusions by showing that subjects were able to predict grip forces during object manipulation [35, 37]. Ariff et al. [5] observed that saccades anticipated the final position of the hand during reaching movements without visual feedback in healthy people. If the arm movement was perturbed by an external force field (i.e., changing the dynamics of the environment), subjects were initially not able to predict the final hand location, but they regained this capability after learning. The authors of this work concluded that saccades rely on predictions, and that the internal forward model can be adapted to account for changes in the environment [88]. Miall et al. [83] showed that perturbing the cerebellum with transcranial magnetic stimulation (TMS) pulses during arm movements led to a delayed estimate of the position of the limb.

The elements required by the CNS to compute forward sensory predictions are the motor commands and an estimate of the current body state. It is hypothesized that the CNS keeps a copy of the descending motor signals, called efference copy. Estimating the current body state involves the integration of various sensory modalities, which might carry different noisy information. This process has been explained with the framework of Bayesian integration. In this context, previous sensory predictions are used to compute a prior probability of the current state, which is then combined with the actual sensory information, obtaining a posterior probability. The latter represents the current belief about the state of the body and the environment. These ideas have been tested experimentally by assessing the capability of human subjects to estimate positions [67], forces [66], and velocities [112] using sensory information corrupted by noise.
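
For a scalar state and Gaussian uncertainty, which is an assumption made here for illustration, the combination of prior and sensory information has a closed form. The sketch below uses hypothetical names and values and shows how the posterior leans toward whichever source is more reliable.

```python
def bayesian_integration(prior_mean, prior_var, obs, obs_var):
    """Combine a Gaussian prior with a Gaussian observation of the same quantity."""
    gain = prior_var / (prior_var + obs_var)  # weight given to the observation
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var

# Equally reliable prior and observation: the posterior lies halfway between them.
mean_equal, var_equal = bayesian_integration(0.0, 1.0, obs=2.0, obs_var=1.0)

# Noisier observation: the posterior stays closer to the prior.
mean_noisy, var_noisy = bayesian_integration(0.0, 1.0, obs=2.0, obs_var=4.0)
```

In both cases the posterior variance is smaller than the prior variance, reflecting the information gained from the observation.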

Optimal feedback control, internal models, and Bayesian integration can be assembled in a unified computational framework, depicted in Fig. 1, that arguably represents the most comprehensive view of motor control [104].

Fig. 1

A unified view of motor control theories. Motor commands are generated according to an optimal feedback control policy, which embeds the requirements of the task. Body and environment react to these commands, and move to a different state. The sensory system measures the new state but, due to time delays in the neural pathways, it provides “out-of-date” measurements. Optimal feedback control, however, needs updated, rather than delayed, feedback information in order to generate optimal motor commands. Such updated information is provided by a fast state estimator, which integrates the sensory measurements (that possibly arrive from a variety of sensory streams) with a prediction of the sensory consequences of motor commands; hypothetically, this integration is performed according to the Bayesian framework. Predicted sensory consequences are generated by a forward model. This scheme has been adapted with permission from Shadmehr and Krakauer [104]

2.4 Equilibrium Point Hypothesis

An alternative to the models discussed so far is the so-called equilibrium point hypothesis (EPH) [32, 33]. This hypothesis assumes that the CNS controls body parameters rather than variables directly related to the task, and that movements emerge from the physical interaction between the appropriately tuned body dynamics and the environment. In particular, it is hypothesized that descending motor commands adjust parameters of the tonic stretch reflex in order to produce a desired equilibrium point of the limb [31]. Thus, the EPH exemplifies the main idea of the dynamical pattern theory of motor control, i.e., movements are emergent properties [102, 114].

To understand the EPH, it is necessary to briefly describe its key ingredient, the tonic stretch reflex. This is defined as a sustained muscle contraction in response to slow stretching [72]. When a muscle is slowly stretched by an external load, initially it produces an opposing force due to its passive elastic properties. If the muscle length exceeds a certain threshold, the subsequent activity of muscle spindles leads to the recruitment of a group of motor neurons, which causes the muscle to contract, producing an active force that opposes the stretch. This force increases nonlinearly with the amount of stretch. For a given constant load, the muscle stabilizes at a given length called equilibrium point.

In the context of the EPH, the position of a limb results from the equilibrium points of the muscles around its joints. In order to generate voluntary movements, the brain sends descending commands that modify the threshold of the tonic stretch reflex arcs. As a result, new equilibrium positions are defined, and the limb moves accordingly. This idea has a few implications that are worth discussing. First, muscle activation is not directly controlled by descending motor commands; rather, it results from the tonic stretch reflex. In other words, for a constant motor command (which defines the threshold of the reflex), different muscle activations as well as limb positions can be obtained depending on the external load. Second, there is no need to estimate the body state to compute appropriate motor commands. Indeed, under an assumption of stability, the body will move toward the equilibrium point independently of its initial condition.
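
The relation between reflex threshold, external load, and equilibrium length can be sketched for a single muscle. The exponential force characteristic, the parameter values, and the function names below are assumptions introduced for illustration, and passive elastic forces are ignored for simplicity.

```python
import math

def active_force(length, threshold, gain=1.0, steepness=20.0):
    """Muscle force under an assumed exponential stretch-reflex characteristic."""
    stretch = length - threshold
    if stretch <= 0.0:
        return 0.0  # below threshold the reflex is silent
    return gain * (math.exp(steepness * stretch) - 1.0)

def equilibrium_length(load, threshold, gain=1.0, steepness=20.0):
    """Length at which the active force balances a constant external load."""
    return threshold + math.log(1.0 + load / gain) / steepness

# Same load, two descending commands (reflex thresholds): the equilibrium shifts.
eq_low_threshold = equilibrium_length(load=5.0, threshold=0.10)
eq_high_threshold = equilibrium_length(load=5.0, threshold=0.12)

# Same command, heavier load: a different equilibrium length (and muscle force).
eq_heavier_load = equilibrium_length(load=10.0, threshold=0.10)
```

Changing the threshold by a given amount shifts the equilibrium length by the same amount, while a constant threshold yields different lengths and forces under different loads, in line with the first implication discussed above.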

The problem of motor coordination is not solved by the EPH. A great number of variables, in this case the parameters of the stretch reflexes across muscles, should be coordinated in order to accomplish the desired task. How does the CNS solve such a redundancy? To this end, the uncontrolled manifold hypothesis (UMH) has been suggested as a general principle of coordination that could be applied at any level of detail of the CNS. The idea is that the controller tries to keep the values of a group of task-related “elemental variables” (e.g., joint angles, muscle forces, muscle activations, thresholds of tonic stretch reflexes), named structural unit or synergy, within a subspace corresponding to successful task achievement (the uncontrolled manifold). Thus, the controller does not specify a single task solution (as in the case of optimal control); rather, it facilitates variability within the uncontrolled manifold. In principle, this is the same behavior as that of an optimal feedback controller, which only reacts to deviations on task-related dimensions (see Sect. 2.2). The UMH and the notion of structural units have been used to explain postural control [42, 71, 74, 123] and manipulation [22, 73, 108, 127].
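
A common way to probe this idea is to split trial-to-trial variability into a component along the uncontrolled manifold and a component orthogonal to it. The sketch below does this for a hypothetical task in which two elemental variables must sum to a constant, so that the manifold is the line x1 + x2 = const; the data and the function name are invented for illustration.

```python
import math

def ucm_variances(samples):
    """Variance along and orthogonal to the manifold x1 + x2 = const."""
    n = len(samples)
    mean1 = sum(s[0] for s in samples) / n
    mean2 = sum(s[1] for s in samples) / n
    along, orthogonal = 0.0, 0.0
    for x1, x2 in samples:
        d1, d2 = x1 - mean1, x2 - mean2
        along += ((d1 - d2) / math.sqrt(2.0)) ** 2       # leaves the sum unchanged
        orthogonal += ((d1 + d2) / math.sqrt(2.0)) ** 2  # changes the sum
    return along / n, orthogonal / n

# Hypothetical trials: large trade-offs between x1 and x2, nearly constant sum.
trials = [(6.0, 4.0), (4.0, 6.0), (5.5, 4.5), (4.5, 5.5), (5.0, 5.1), (5.0, 4.9)]
v_ucm, v_ort = ucm_variances(trials)
```

For these data most of the variance lies along the manifold (about 0.835 versus 0.0017), the pattern that the UMH interprets as a synergy stabilizing the task variable.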

Usually, scientists who support the EPH are rather skeptical about the idea that the CNS learns and uses internal models. Instead, they are more inclined to think that no heavy computations are performed, and that movements emerge from the interaction between body and environment. As a matter of fact, however, there is the need for the CNS to compute how to modify the parameters of the tonic stretch reflex in order to accomplish a desired task. Thus, a mapping between motor commands (i.e., reflex thresholds) and output variables (i.e., an internal model) might still be needed.

3 Motor Learning

Humans show a remarkable capacity to learn a variety of motor skills, whether it is adapting to changes in our environment, acquiring new skills, or improving existing skills. A lot of progress has been made on motor learning over the last few decades; however, researchers have a fair understanding of motor learning only for a narrow range of tasks, including simple reaching tasks in which different types of perturbations are applied. One of the exciting challenges ahead includes bridging the knowledge on simple movements to ‘real-world’ motor learning, and translating this knowledge to neurorehabilitation paradigms.

Motor learning is a broadly defined term referring to improvement in motor performance through practice [69]. It is believed that motor learning consists of multiple processes, of which motor adaptation and skill acquisition are considered to be the main processes in the literature [64, 69]. Motor adaptation is commonly defined as the response of the motor system to perturbations, such as changes in the environment, to regain a former level of performance in the new, changed environment [106]. Skill acquisition is considered to be a process in which task performance is improved beyond the baseline, mostly in the absence of perturbations. Researchers posit that skill acquisition is manifested by reduced motor variability and achieving higher levels of performance without a reduction of speed [69, 94, 109].

The goal of this section is to provide an overview of motor learning. Note that excellent reviews are already available describing the substantial progress in our understanding of the mechanisms of motor learning over the last decades (e.g., see Refs. [69, 106, 125]). Here, we give a short overview of the most important aspects of these mechanisms as a background for the other sections and chapters of this book.

3.1 Motor Adaptation

Motor adaptation has been investigated extensively using error-based learning paradigms, such as visuomotor rotations or force fields [105, 106]. In these paradigms, participants experience a perturbation resulting in a discrepancy between the predicted and executed hand trajectories; for instance, due to a perturbation in visual information (visuomotor rotations), or to perturbing forces (force field paradigms) [69, 106]; see Fig. 2 for a short description of a visuomotor learning paradigm. Adaptation is the process that reduces the systematic error induced by the perturbation, and it is believed to occur through trial-by-trial adjustments of an internal model (the forward model) that maps motor commands onto predicted sensory outcomes. By doing so, error-based learning keeps movements well calibrated and corrects for systematic biases [106].
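
Trial-by-trial error-based adaptation is often summarized with a single-state update rule in which a fraction of each error is used to correct the next movement. The sketch below is a minimal version under that assumption; the function name, the retention factor, and the learning rate are hypothetical values chosen for illustration, and the perturbation mimics a constant 45 degree rotation.

```python
def simulate_adaptation(perturbation, n_trials, retention=1.0, learning_rate=0.2):
    """Trial-by-trial error-based adaptation to a constant perturbation."""
    compensation = 0.0  # internal estimate of the perturbation
    errors = []
    for _ in range(n_trials):
        error = perturbation - compensation  # residual error on this trial
        errors.append(error)
        # Retain part of the previous estimate and correct by a fraction of the error.
        compensation = retention * compensation + learning_rate * error
    return errors

errors = simulate_adaptation(perturbation=45.0, n_trials=40)
```

The error starts at the full size of the perturbation and decays approximately exponentially across trials, which is the qualitative shape of a typical adaptation curve.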

Fig. 2

A visuomotor rotation is a commonly used error-based learning paradigm. a Participants are asked to make movements with their hand so that a cursor moves from a starting position to a target. In the baseline condition, hand and cursor movement are congruent. In the adaptation phase, a visual rotation is imposed (45 degrees counterclockwise in this case) on the cursor movement; e.g., when moving the hand straight forward, the cursor would move at an angle. Studies have shown that participants gradually learn to move their hand in a way that compensates for the rotation, such that the cursor moves to the target again. b This figure shows a typical adaptation curve. When the rotation is introduced, the error at the end of the reaching movement initially is large, followed by a gradual decline of endpoint errors with increasing number of movements. At some point, the movement error is similar to the baseline, indicating that the participant has adapted to the visual rotation. Adapted from [82, 106]

3.1.1 Error as a Learning Signal

The learning signal driving adaptation in error-based learning is, as the name implies, the error signal between a desired and actual action, as well as the particular way the desired action was missed [106, 125]. The error signal is believed to adapt the motor commands, such that the error decreases in consecutive movements [106]. Wolpert and colleagues reported that in order to adapt to perturbations, the nervous system also estimates the gradient of the error with respect to each motor command component [125]. This means that the motor system needs to have an idea of how components of the motor command contribute to the error, and subsequently how the motor system can reduce the error. Wei and Körding posited that the sensorimotor system might adapt to errors in a nonlinear fashion [124]. They suggested that the sensorimotor system must weigh the information carried by an error according to the uncertainty associated with that signal. The ideal strategy, they argue, is therefore nonlinear, where small errors are compensated for in a linear fashion and large errors would be disregarded. Errors that fall within the expected variance will be adapted for in a fairly linear way, whereas participants showed nonlinear and nonspecific adaptation to single trials containing error signals that exceeded expectation [41, 124].

3.1.2 Different Processes of Motor Adaptation

Temporal processes. Smith and colleagues [111] proposed a model in which two parallel temporal processes drive motor adaptation: (1) a fast-acting process that learns and forgets quickly and (2) a slow-acting process that learns and forgets more slowly. This model is able to explain complex features of motor learning such as spontaneous recovery of learning, savings (relearning of a perturbation or skill is faster than the initial learning), anterograde interference (the ability of a previously learned force field task to reduce the learning rate of a different subsequent task) and even patterns of 24-hour retention [61, 110]. More recent studies suggested that additional learning processes also need to be present to fully explain the temporal evolution of motor adaptation. Lee and Schweighofer [76] proposed a model with a single fast process combined with multiple slow processes that could explain different types of adaptation tasks. An advantage of such a multi-rate learning model is that it can account for different temporal changes of the sensorimotor system, such as fatigue or injury [76, 125].
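
A two-process model of this kind can be sketched as two error-driven states that are summed to give the net adaptation. The update rule below follows the general structure described above, but the parameter values, the perturbation schedule, and the function name are illustrative assumptions rather than values taken from the cited studies. Error-clamp trials, in which the experienced error is held at zero, are marked with `None`.

```python
def two_rate_model(perturbations, a_fast=0.59, b_fast=0.21, a_slow=0.992, b_slow=0.02):
    """Net adaptation per trial; None in `perturbations` marks an error-clamp trial."""
    fast, slow = 0.0, 0.0
    output = []
    for p in perturbations:
        net = fast + slow
        output.append(net)
        error = 0.0 if p is None else p - net
        fast = a_fast * fast + b_fast * error  # learns quickly, forgets quickly
        slow = a_slow * slow + b_slow * error  # learns slowly, forgets slowly
    return output

# Long adaptation, brief counter-perturbation, then error-clamp trials.
schedule = [1.0] * 300 + [-1.0] * 15 + [None] * 60
net = two_rate_model(schedule)
```

At the end of the counter-perturbation the net output is slightly negative, yet during the error-clamp trials it rebounds toward the initially learned direction. The fast state decays within a few trials and uncovers the slow state, which still holds part of the original adaptation; this is the spontaneous recovery mentioned above.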

Model-based and model-free processes. It is likely that multiple processes occur during motor learning, which are often classified as model-based learning processes (e.g., adaptation of the internal model) or model-free learning processes (e.g., use-dependent plasticity and reinforcement learning). For instance, studies have shown that several (model-free) processes occur besides error-based learning (adaptation): use-dependent plasticity [28, 56, 121] and reinforcement learning [56].

It has been shown that repeating a movement in a particular direction not only reduces movement variability, but also creates a bias toward that direction in future movements [121]. This repetition-induced bias has been termed use-dependent plasticity [69]. A couple of studies showed that when performing a reaching task in a perturbed environment, adaptation and use-dependent plasticity occur simultaneously [28, 56]. Huang and colleagues used a modified visuomotor rotation paradigm to show that participants, when adapting to the visuomotor rotation, create a bias toward the adapted movement direction [56].

In addition, Huang and colleagues [56] hypothesized that, during a visuomotor rotation adaptation task, hitting a target is a form of implicit reward driving a reinforcement process whereby successful error reduction is associated with the motor commands. They also showed that the model-free reinforcement learning process is independent of model-based learning (adaptation). Combining the model-based adaptation process with the reinforcement process leads to faster relearning (i.e., savings).

3.1.3 Structural Learning

Structural learning is a framework to explain the learning-to-learn phenomenon [14, 15]. Structural learning can be considered as learning certain features of a learning task, such that learning of similar tasks is facilitated. Braun and colleagues found support for structural learning by having participants perform reaching movements, during which random visuomotor rotations were imposed. The participants then adapted to a constant visuomotor rotation. They found that being exposed to the random visuomotor rotations facilitated learning in the constant rotation [14]. Braun et al. suggested that training with the random rotations allowed the participants to extract relevant features, or structures, of the task; all tasks were rotations. Structural learning is also consistent with the Bayesian framework, in that it would correspond to learning new prior distributions on the parameters of the perturbation [9, 10, 34].

3.1.4 Neural Correlates of Adaptation

Although the notions of different learning processes are intriguing, it is still not completely known how the brain performs all these hypothesized actions. Evidence suggests that the cerebellum plays an important role in trial-by-trial error-based learning [8, 26, 29, 117]. More specifically, some studies posit that the cerebellum computes the prediction error that drives adaptation [99, 113]. Patients with cerebellar lesions showed substantial impairment in fast adaptation across different tasks [26, 117]. Brain stimulation studies found that enhanced cerebral activity using transcranial direct current stimulation resulted in faster adaptation [44, 47, 100]. Where different types of adaptation are neurally stored remains an open question [125].

3.2 Skill Learning

Although the motor system aims to reduce the error to zero in error-based learning, this process does not systematically improve performance beyond baseline, a feature that is considered to be crucial in skill acquisition [82, 94, 109, 125]. Unlike adaptation, skill acquisition is studied for tasks where often no perturbation is present. Although different learning processes, such as reinforcement learning, are likely to play important roles in skill acquisition, they are not as well understood as the mechanisms underlying error-based learning.

3.2.1 Reinforcement Learning

To achieve an increase in performance, such as a reduction in error variability, reinforcement learning can help to find a solution to a movement problem. Reinforcement learning is driven by a reward signal; for instance, the information about the relative success and failure of a movement [41, 125]. In contrast to the error signal in error-based learning, a reward signal does not give information about the direction of required behavioral change [125]. Therefore, reinforcement learning tends to be slower than error-based adaptation. However, when a complex sequence of actions is necessary to achieve a goal, reinforcement learning can be used to explain what actions led to success and which led to failure, whereas error-based learning might be less successful.
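
The difference between a reward signal and a signed error can be sketched with a deliberately simple reward-driven search, which is not a full reinforcement-learning algorithm. The learner below never sees the direction of its error; it only explores variations of its motor command and keeps those that were rewarded more. The function name, the reward definition, and all parameter values are hypothetical.

```python
import random

def reward_based_search(target, n_trials=200, exploration_sd=0.5, seed=0):
    """Learn a 1-D motor command from a scalar reward only (no signed error)."""
    rng = random.Random(seed)
    command = 0.0
    best_reward = -abs(target - command)
    history = [best_reward]
    for _ in range(n_trials):
        candidate = command + rng.gauss(0.0, exploration_sd)  # explore around the command
        reward = -abs(target - candidate)  # degree of success, not its direction
        if reward > best_reward:  # keep only variations that were rewarded more
            command, best_reward = candidate, reward
        history.append(best_reward)
    return command, history

command, history = reward_based_search(target=3.0)
```

Because the learner has to discover the useful direction by exploration, many trials bring no improvement at all, which is one intuition for why reward-based learning tends to be slower than error-based adaptation.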

3.2.2 Speed-Accuracy Trade-Off

Recent research has defined skill acquisition as a shift in the speed-accuracy trade-off function (SAF) [94, 109]. Reis and colleagues argue that defining skill acquisition as a shift in SAF is necessary, otherwise it is not clear how to relate changes in speed and accuracy to a change in skill. For instance, one could reduce execution speed and obtain a higher accuracy by “moving” along the same SAF, which would not reflect a change in skill.
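
The distinction between moving along a speed-accuracy trade-off function and shifting it can be sketched as follows. The linear form of the pre-training SAF, its parameters, and the function names are assumptions made purely for illustration; an observation counts as a change in skill only if its error is lower than the pre-training SAF predicts at the same speed.

```python
def pre_training_saf(speed, base_error=0.05, slope=0.4):
    """Assumed pre-training SAF: the error grows linearly with movement speed."""
    return base_error + slope * speed

def reflects_skill_change(speed, error, tolerance=1e-9):
    """True only if the observation lies below the pre-training SAF at that speed."""
    return error < pre_training_saf(speed) - tolerance

# Slowing down improves accuracy but stays on the same SAF.
slower_same_skill = (0.5, pre_training_saf(0.5))
# Same speed as before training, but a smaller error than the SAF predicts.
same_speed_better = (1.0, 0.30)

moved_along_saf = reflects_skill_change(*slower_same_skill)
shifted_saf = reflects_skill_change(*same_speed_better)
```

Here `moved_along_saf` is `False` even though the error is lower than at the original speed, whereas `shifted_saf` is `True`.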

Furthermore, Shmuelof et al. posit that a crucial concept regarding skilled performance is that successful execution and the trajectory kinematics associated with this execution are distinct. This is the case because only the task success is explicitly required, whereas there may be multiple kinematic solutions that reach the desired goal [109]. In an experiment where subjects were instructed to follow a curved path without perturbation using wrist motions, the authors examined changes in the SAF and trajectory kinematics during learning. They found that practicing at restricted speeds led to a global shift of the SAF. Improved performance largely resulted from reduced trial-to-trial variability and increased movement smoothness. The authors propose that motor skill acquisition can be characterized as a slow reduction in movement variability, which is consistent with previous studies [85, 86] but distinct from faster model-based learning, which reduces error in adaptation paradigms.

3.2.3 Skill Learning and Optimality

Optimal feedback control (OFC), as described in Sect. 2.2, could be used to study skill learning [27, 69]. Although OFC has not been used to describe the learning process itself yet, it has been used to explain how we learn to control complex objects with internal degrees of freedom [87], see Fig. 3. For these tasks, there is no simple one-to-one mapping from the hand state to the state of the object (i.e., there are uncontrolled degrees of freedom). During training, participants interacted with the objects and showed improvements in meeting an accuracy criterion even though they had to move faster (i.e., shift in SAF, which is considered to be an improvement in skill). The hand kinematics after training could be described by OFC using a relatively simple cost function. The authors assumed that during training, the participants adapted to the complex dynamics in accordance with a model-based optimization of the cost function [87]. One could speculate that only the model-based optimization part would lead to skill acquisition; however, since the training was not the focus of Nagengast’s study, insufficient data were available. Krakauer and Mazzoni suggested that two processes could occur during training, leading to better performance: convergence to the optimal policy, or improved execution of the control itself. Either of these processes could lead to a shift in SAF [69] and to reductions in movement variability [85, 86].

Fig. 3

Optimal feedback control could be used to study motor skill learning. a Schematic representation of the task. Participants were asked to move both their hand and the object from a start position to a target within a prescribed time window. The hand and object were connected through the complex dynamics of a mass-damper-spring system. b The recorded hand trajectory (blue dashed line) and simulated hand trajectory (using OFC) are shown for a particular mass-damper-spring system. Note that a relatively complex hand trajectory was necessary to move the hand and object to the target. Nagengast et al. concluded that the simulated hand trajectory fits the measured hand trajectory well. c The measured object trajectory and simulated object trajectory describe a relatively straight line from the start to the target. As for the hand, the simulated object trajectory described the measured object trajectory well. Adapted from [87]

4 Application to Upper Limb Prosthesis Users

“Neurorehabilitation is based on the assumption that motor learning principles can be applied to motor recovery after injury, and that training can lead to permanent improvements in motor function in patients with motor deficits”. Given this statement from [63], the reconstruction of joint function in upper limb prosthesis users appears to be a special case of neurorehabilitation.

Amputees have a quite different medical history than, for example, stroke survivors. This is because prosthesis users have either lost one or more joints due to an accident, or have undergone reconstructive surgery that unfortunately ended in an amputation. Besides pain and physiological problems, prosthesis users are substantially influenced by psychological factors, such as (i) learning ability, (ii) cognitive skills, (iii) motor skills, and (iv) mental status (e.g., motivation, will, stress), which are situated in their mental–body. Thus, a prosthesis user needs time for adaptation and reorganization of the neuronal network to the new setup. It seems that they feel and imagine their original joints and can also move them, a phenomenon called phantom limb [91], and that they can even feel phantom-limb pain [39, 101].

Reasons for amputation can be different; however, all amputees have to struggle with the new situation: some structure of their limbs is no longer present, but the synaptic input connections to the brain, that is, the neural network, are still present. Some afferent connections are lost, so that the corresponding synapses are simply left open, and some efferent connections (i.e., axons from neurons that formerly controlled muscles of the lost joints) also end without a target. Patients have been able to perform mental finger motions right after amputation, and after several years they are still capable of controlling their forearm muscles. This has been taken as evidence of brain plasticity and reorganization [91].

Hence, exploiting the phantom-limb phenomenon could enable more intuitive prosthesis control for users. They may simply try to move the phantom-limb joints as if they were using their original joints. In particular, contractions of residual muscles of the stump can be captured by means of surface EMG electrodes and used for the control of the prosthesis.

4.1 Prostheses of Today

Standard applications of prosthesis control use two EMG electrodes, one on the flexors’ side and one on the extensors’ side of the residual part of an amputated upper limb, either on the forearm or the upper arm. Such a setup enables the control of at least one degree of freedom (DOF). In order to support more DOFs, a switching mechanism is used to switch between the available DOFs. This switching mechanism can be implemented by co-contractions or other muscle activation sequences. Although this works in principle and is relatively simple, the downside is that, because of the required switching actions, the full prosthesis control has a low chance of being integrated over time into dedicated motor programs by the user’s brain.
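The two-electrode scheme with co-contraction switching can be sketched as a small state machine. The following is a minimal illustration, not the control law of any commercial device: the class name, the DOF labels, the normalized EMG envelopes, and the threshold value are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class TwoChannelController:
    dofs: tuple = ("hand open/close", "wrist rotation")  # assumed DOF labels
    threshold: float = 0.3   # assumed threshold on normalized EMG envelopes
    active: int = 0          # index of the DOF currently under control
    _was_cocontracting: bool = False

    def step(self, flexor, extensor):
        """Return (dof_name, command) for one pair of EMG envelope samples."""
        cocontraction = flexor > self.threshold and extensor > self.threshold
        if cocontraction:
            if not self._was_cocontracting:   # switch once per co-contraction
                self.active = (self.active + 1) % len(self.dofs)
            self._was_cocontracting = True
            return self.dofs[self.active], 0.0
        self._was_cocontracting = False
        if flexor > self.threshold:
            return self.dofs[self.active], +flexor
        if extensor > self.threshold:
            return self.dofs[self.active], -extensor
        return self.dofs[self.active], 0.0

controller = TwoChannelController()
for sample in [(0.8, 0.1), (0.9, 0.9), (0.1, 0.7)]:
    print(controller.step(*sample))
```

The sketch makes the drawback mentioned above visible: every change of DOF requires an explicit switching action that produces no movement of the prosthesis.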

In recent years, more dexterous prosthesis components and systems have emerged on the market providing more DOFs, e.g., the Michelangelo®-Hand Advanced Prosthesis System (Otto Bock Healthcare Products GmbH, D), the iLimb Hand (Touch Bionics, UK), the be-bionics-Hand (RSLSteeper, UK), or the Vincent Hand (Vincent Systems GmbH, D), to name a few. An EMG-controlled prosthesis consists of an inner shaft and an outer shaft. The inner shaft carries the EMG electrodes and fits the prosthesis user’s stump very tightly in order to provide a vacuum in the socket for fixation. The outer shaft is made of carbon or another material that protects the prosthesis equipment and provides the carrier for the hand component. Fitting the prosthesis to its user is a mandatory step toward successful prosthesis use.

For the control of advanced devices, more signals are required, and can be obtained using additional electrodes. However, muscles do not work independently, because of synergies that include groups of two or more muscles. Therefore, separability between single muscle contractions is not naturally given and can be achieved only approximately by intense training.

4.2 Prosthesis Control, Machine and Human Learning

It is assumed that a prosthesis user has at least a residual understanding of performing phantom movements [91]. In addition, motor programs are assumed to work for voluntarily controlled joint movements [43] as they do for continuously repeated movements. The more degrees of freedom a multifunctional prosthesis provides, the more factors of user performance become important. These factors originate from users’ motor abilities, such as the discriminability of their EMG signal pattern vectors between different phantom-like movements and the precision of repeating them always in the same manner.

During assessments, psychometric measures of user ability and classification performance for rating user performance in laboratory [45] and real-life scenarios [4] have been applied. When a novice prosthesis user tries to perform repetitions of the same movement, using a certain joint and using the same contraction, it can happen that the resulting outcomes are not always the same. This observation can be attributed to variability in motor control.

To cope with the variability of motor control, statistics and machine learning are often used to control robotic prostheses. To this end, it is crucial that the collected training set provides sufficient information on the relationship between EMG signals and desired movements. Figure 4 shows three exemplary training sets: (i) a small training set (pictured in red), which can be obtained with minimal training effort; (ii) a huge training set (black), which is robust to variability but requires a very high training effort; and (iii) a medium training set (blue), which represents a trade-off between variability and learnability. In Fig. 4, a mean-shift (see Footnote 1) between training and test data D is depicted. While there is no overlap between the small training set and the test set, the medium training set comprises the test set and can thus lead to satisfactory performance.

Fig. 4

Toy example of the statistics of three training sets with different data sizes and one test set D. The small set consists of only a few training trials, so that a small mean-shift of the test set D distribution leads to nonfunctional behavior. The huge set is robust against mean-shift, but requires too much training effort. The medium (sufficient) set uses an optimized trial set and is more robust to slightly changed distributions
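The toy example of Fig. 4 can be reproduced numerically. The sketch below rests on assumptions that are not part of the original figure: each set is modeled as a one-dimensional Gaussian feature distribution, the spreads and the size of the mean-shift are arbitrary, and a test sample counts as "covered" when it falls within two standard deviations of the training mean.

```python
import math

def normal_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def coverage(train_mean, train_std, test_mean, test_std, k=2.0):
    """Fraction of the test distribution inside mean ± k·std of the training set."""
    lo, hi = train_mean - k * train_std, train_mean + k * train_std
    return normal_cdf(hi, test_mean, test_std) - normal_cdf(lo, test_mean, test_std)

# Assumed spreads of the three training sets (arbitrary feature units).
training_sets = {"small": 0.2, "medium": 1.0, "huge": 3.0}
test_mean, test_std = 1.0, 0.2   # test set D, mean-shifted by 1.0

for name, std in training_sets.items():
    print(f"{name:>6}: {coverage(0.0, std, test_mean, test_std):.3f}")
```

With these assumed numbers, the small set covers almost none of the shifted test data, whereas the medium set already covers practically all of it, so the additional effort of collecting the huge set brings no further benefit for this particular shift.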

In conclusion, different phantom movements should result in differentiable muscle contractions with no overlapping EMG patterns. This can only be achieved by repeated training, perhaps with visual feedback to speed up the learning process.

4.3 Optimization of Training

Training amputees to use robotic prostheses should be divided into two components: (i) training for machine learning, which identifies a mapping between EMG readings and prosthesis joint control signals; and (ii) training for human learning, which should train amputees to perform stable and repeatable phantom movements.

While the literature focuses mainly on machine learning techniques for prosthesis control, in practice, the variability of user behavior is more crucial and may completely degrade the performance of a sophisticated machine learning solution. Additionally, the more functionalities a prosthesis provides to the user, the more precisely the user must perform the required muscle contractions in order to provide direct control. Recall the assumed benefit: users should eventually be able to simply forget that they are operating a prosthesis, because direct control should, in the long run, integrate seamlessly into the motor cortex.

Applying learning methods to as yet untrained subjects might cause the following problem. When users try to minimize their signal variability, the resulting EMG pattern may overlap with those associated with other phantom movements. This would require retraining the prosthesis user to employ different patterns across movements, which is cumbersome and should be avoided. Information on these EMG overlaps could be exploited by a physiotherapist to guide the amputee to perform distinctive phantom movements [52].
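One simple way to flag overlapping EMG patterns is to compare the distance between the mean patterns of two movements with their spreads. The index below is an assumed, deliberately crude measure chosen for illustration (it is not the method of [52]); the movement names and feature values are hypothetical.

```python
import math
from itertools import combinations

def separability(pattern_a, pattern_b):
    """Distance between class means divided by the sum of class spreads.

    Each pattern is (mean_vector, std); values below 1 suggest overlap.
    """
    (mean_a, std_a), (mean_b, std_b) = pattern_a, pattern_b
    return math.dist(mean_a, mean_b) / (std_a + std_b)

def overlapping_pairs(patterns, min_index=1.0):
    """List the pairs of movements whose EMG patterns are likely to overlap."""
    return [(a, b) for a, b in combinations(patterns, 2)
            if separability(patterns[a], patterns[b]) < min_index]

# Hypothetical two-channel EMG features: (mean per channel, isotropic spread).
patterns = {
    "hand open":  ((0.8, 0.1), 0.10),
    "hand close": ((0.1, 0.8), 0.10),
    "pinch":      ((0.2, 0.7), 0.10),
}
print(overlapping_pairs(patterns))
```

In this example, "hand close" and "pinch" are flagged: their spreads are small, but their mean patterns are too similar, which is exactly the situation in which reducing variability alone does not help and the user has to learn a more distinctive contraction.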

The co-optimization of machine learning and human learning seems to be a promising approach to solve this issue. Interested readers should refer to [4, 45].

5 Motor Learning in Rehabilitation

This section describes some rehabilitation techniques based on motor-learning principles, with special attention to those supported by novel technologies such as robots and electrical stimulation. These rehabilitation techniques rely on the assumption that patients with neurological lesions are able to learn by means of the plasticity of the CNS [43]. The plasticity of the CNS has been demonstrated in the literature [51], and it is currently exploited for rehabilitation after neurological injuries. In this regard, rehabilitation techniques attempt to exploit plasticity to achieve recovery through different motor learning concepts. Patients can be exposed to combinations or sequences of different techniques based on their needs and their performance history [70].

5.1 Constraint-Induced Movement Therapy

Constraint-induced movement therapy (CIMT) is a rehabilitation therapy based on the theory of “learned nonuse.” Learned nonuse develops during the early stages after stroke, as patients begin to compensate for their motor deficits because of the difficulty or inability to successfully carry out motor tasks using their impaired limb [48]. This compensation increases reliance on the intact limb, hindering recovery of the impaired limb.

This rehabilitation technique has two main components and is usually given over 2 weeks [70]. The first is to restrain the less-affected extremity; the second is to practice with the affected limb for 6 hours a day using shaping. Although it has been demonstrated that chronic stroke patients can show significant motor improvements, the use of CIMT remains controversial [70, 120]. The main arguments arise from the facts that the restraint stage can be very frustrating and that the inclusion criteria may lead to the exclusion of many patients. In the literature, there is no clear position on inclusion criteria regarding the patient’s stage after stroke, although CIMT has been used in the acute, subacute, and chronic stages [120]. The generally accepted condition is that patients must be capable of performing at least 10 degrees of finger and wrist extension.

Massive practice is the main principle behind this rehabilitation technique, in which patients are required to use their affected arm to carry out motor activities. At the beginning, this therapy could be frustrating, especially for patients with high motor impairment [79]. It has also been mentioned that, since restricting the movement of the nonaffected arm forces the patient to explore the command space with the affected arm, this technique encourages the search for a global optimum [58].

5.2 Robotics Rehabilitation

The development of robotic therapy was driven by the evidence that the injured motor system can reorganize in the setting of motor practice (plasticity), and by the fact that robotic devices can automate intensive training techniques, increase safety for both patients and therapists, and improve access to therapy [92]. High intensity and repetitive training are the key features to promote motor learning, reduce motor impairment, and enhance motor function [58, 64]. In this regard, robotics provides a great opportunity to deliver a much higher dosage and intensity of training [92].

It is worth noting that these devices allow scientists to carry out a rigorous validation and application of motor learning principles in neurorehabilitation [58]. Furthermore, robotic platforms provide the possibility to test different motor learning principles through variations of control algorithms, to create many dynamic environments, and to investigate the human ability to adapt to them [93].

It is important to note that, as Reinkensmeyer and Patton pointed out, guidance can impair learning because it changes the dynamics of the task to be learned [93]. They also mentioned that guidance could be very helpful for teaching skilled movements that require coordinated motions of multiple joints while vision must be kept on the target object. The usefulness of guidance for trajectory learning may therefore depend on the task to be learned.

Another widely explored trend regarding motor learning and robotic devices is motor adaptation. Huang and Shadmehr [58] describe motor adaptation as a learner’s reaction to a change in the environment. An example of a study that induced motor adaptation consisted of performing reaching movements under perturbations introduced by a planar robot (perturbation force field). These perturbations were perpendicular to the movement direction and proportional to movement velocity [105]. After some movements under the perturbation force field, the learners modified their motor response during reaching movements. This modified response is called the motor after-effect, and it has been demonstrated that the subject’s adapted response temporarily persists as if the perturbation force were still present. This after-effect technique is the opposite of robotic guidance because it increases trajectory errors during movement, and thus it could also be called an error-augmentation strategy [93].
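The force field and the resulting after-effect can be sketched in a few lines. The field below follows the description above (perpendicular to the velocity, proportional to speed); the gain value is an assumption. The trial-by-trial learner is a generic single-state model with assumed retention and learning-rate parameters, used here only for illustration and not taken from [58, 105].

```python
import math

def curl_field_force(vx, vy, gain=15.0):
    """Force (N) perpendicular to hand velocity (m/s), proportional to speed.

    `gain` (N·s/m) is an assumed value.
    """
    return gain * vy, -gain * vx

def simulate_adaptation(field, retention=0.99, learning_rate=0.2):
    """Single-state model: the learner's estimate x of the perturbation is
    updated from the error experienced on each trial."""
    x, errors = 0.0, []
    for f in field:
        error = f - x                 # uncompensated part of the perturbation
        errors.append(error)
        x = retention * x + learning_rate * error
    return errors

fx, fy = curl_field_force(0.3, 0.4)
print(math.hypot(fx, fy))            # force magnitude scales with speed

# 60 trials with the field on, then 20 trials with the field switched off.
errors = simulate_adaptation([1.0] * 60 + [0.0] * 20)
print(errors[0], errors[59], errors[60])
```

In this sketch the error shrinks while the field is on and changes sign on the first trial after the field is removed; this sign reversal, which then decays over the following trials, corresponds to the after-effect.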

Many studies have addressed the motor after-effect for both upper and lower limbs. Scheidt and Stoeckmann used the MIT-Manus to compare force field adaptation in post-stroke and healthy subjects [98]. They found that both groups use the same compensatory strategies, which indicates that post-stroke patients can adapt their motor responses as healthy subjects do, although it may take more training. For the lower limb, Reisman et al. carried out a study on a split-belt treadmill with patients with chronic hemiparesis who showed asymmetry in inter-limb coordination during walking [95]. In the experiment, one belt sped up and the other slowed down. Similarly to the upper limb study, patients presented after-effect responses that improved the symmetry of their gait patterns. A final remark concerning the after-effect is that the learner does not solely adapt to changes in environmental dynamics, but is also able to anticipate the expected dynamics of the new environment and move according to a new set of expectations. Huang and Shadmehr [58] mentioned that motor adaptation appears to rely on an update of the internal representation (internal model) of the external environment, and that the internal model learned in the robotic force field paradigm could be retained over time.

Rehabilitation training must be executed so as to achieve a lasting and generalizable gain of motor capabilities. The main idea behind generalization is that training task X must lead to improved performance in tasks Y and Z. Baraduc and Wolpert [6] performed experiments on reaching and pointing to a target from the same starting point using the index finger but with different initial arm configurations, and concluded that in robotic neurorehabilitation it is important to train patients across different movement directions to learn a task. Wang and Sainburg [122] discovered that training under clockwise force dynamic perturbations in one arm can generalize to the other arm with counterclockwise perturbations. Hemminger et al. [21] concluded that adaptation to force dynamics can transfer only from the dominant to the nondominant arm. Regarding the rate of force field variation, gradual user adaptation to a force field promotes larger and longer lasting after-effects than sudden changes in the force field [65].

It is also important to remember that a variable training schedule is better than a continuous schedule, as it promotes retention and generalization [68]. Aboukhalil et al. concluded that motor retention is higher when training sessions are temporally distributed over a period of time [2]. Huang and Krakauer also affirmed that minutes or even hours between training sessions may facilitate consolidation of motor memories [57]. To address these principles, robotic therapies must be programmed to combine two or more tasks in one session rather than training only one task.

Robotic rehabilitation is an ideal tool to both test and eventually implement rehabilitation paradigms to aid motor recovery after stroke and other central nervous system diseases [58]. Furthermore, it enables the delivery of automated and predefined training sessions. However, in order to promote motor skill acquisition and retain it beyond the training session, motor control principles should be taken into account.

5.3 Triggered-Based Functional Electrical Stimulation on Rehabilitation

Instead of using a robot to drive the limb of a patient by applying external mechanical forces, neuromuscular electrical stimulation (NMES) therapy facilitates exercise execution led by the participant’s own muscles. NMES relies on short electrical pulses with the aim of recruiting motor neurons that generate muscle activations and hence produce movements. The intensity of the electrical pulses sets the total charge transferred to the muscle. The amount of charge delivered to the muscles depends on pulse shape, amplitude, width, and frequency [78]. The use of NMES has arisen from research cooperation between disciplines such as neurophysiology, engineering, and rehabilitation, among others, which has promoted its development and use for rehabilitation purposes [16].
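For the simplest pulse shape, the dependence of charge on the stimulation parameters can be written down directly. The sketch below assumes rectangular, monophasic, current-controlled pulses, for which the charge of one pulse is the product of amplitude and width; the parameter values in the example are illustrative and not recommendations from [78].

```python
def charge_per_pulse_uC(amplitude_mA, width_us):
    """Charge of one rectangular pulse in microcoulombs (mA × µs = nC)."""
    return amplitude_mA * width_us / 1000.0

def charge_per_second_uC(amplitude_mA, width_us, frequency_hz):
    """Charge delivered per second of stimulation, in microcoulombs."""
    return charge_per_pulse_uC(amplitude_mA, width_us) * frequency_hz

# Example: 20 mA pulses of 300 µs at 30 Hz (assumed values).
print(charge_per_pulse_uC(20, 300))        # charge of a single pulse
print(charge_per_second_uC(20, 300, 30))   # charge per second of stimulation
```

For other pulse shapes, the product in `charge_per_pulse_uC` would have to be replaced by the integral of the current over the pulse duration.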

In the literature, two applications of electrical stimulation for motor relearning can be distinguished: cyclic NMES and neuroprosthetics. Cyclic NMES is defined as continuous or periodic muscle stimulation through electrical pulses, whereas the term neuroprosthetics refers to an artificial system that bypasses the neural system to restore lost body functions by providing functional movement patterns using electrical stimulation [59]. During cyclic NMES, patients are passive and are not required to make any cognitive effort, in the form of either initiation of muscle contraction, interpretation of afferent signals, or functionality of the motor task [107]. In neuroprosthetic applications, by contrast, alternative motor pathways are recruited and activated to assist the damaged efferent pathways of the central nervous system [19]. In this second strategy, repetitive movement training is performed in the context of functional behavioral tasks [16, 107].

The most popular way to perform user-triggered NMES is based on electromyography (EMG) signals. EMG-triggered stimulation consists of monitoring the activity of one or more muscles and triggering NMES when the corresponding EMG signals exceed a predefined threshold. This technique is usually used in patients with residual motor function, in whom motor neural connections are still working, so that voluntary commands generate EMG signals strong enough to be distinguished from baseline activity. A step beyond this approach was presented in [1], where the stimulation was modulated in proportion to the voluntary EMG, so that a closed-loop EMG-controlled system resulted in both clinical improvement of the paretic upper extremity and cortical modulation in patients after stroke. When neural connections are too weak and the muscle baseline signal is indistinguishable from the activation stage, electroencephalography (EEG) signals could be used [55]. In this case, patients’ intentions can be detected by monitoring cortical brain activity, which then triggers the electrical stimulation to assist the movement. Two motor learning principles are coupled during voluntary NMES-triggered therapies: repetition and sensorimotor integration [70]. Furthermore, triggered NMES has also been coupled with a randomized practice schedule, testing the hypothesis that contextual interference will aid recovery [19], and with bilateral coordination training [18].
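The two EMG-based strategies described above, threshold triggering and proportional modulation, can be sketched as follows. This is a simplified illustration: the moving-average envelope, the threshold of three times the baseline, the maximum intensity, and all function names are assumptions and do not reproduce the systems of [1] or any clinical stimulator.

```python
def envelope(emg, window=5):
    """Moving average of the rectified EMG signal."""
    rectified = [abs(s) for s in emg]
    out = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def triggered_stimulation(emg, baseline, k=3.0, window=5):
    """On/off trigger: stimulate while the envelope exceeds k × baseline."""
    return [e > k * baseline for e in envelope(emg, window)]

def proportional_stimulation(emg, baseline, max_level,
                             max_intensity_mA=30.0, window=5):
    """Stimulation intensity scaled with the voluntary EMG above baseline."""
    out = []
    for e in envelope(emg, window):
        fraction = (e - baseline) / (max_level - baseline)
        out.append(max_intensity_mA * min(1.0, max(0.0, fraction)))
    return out

# Synthetic signal: 10 samples of rest followed by 10 samples of contraction.
emg = [0.01, -0.01] * 5 + [0.5, -0.5] * 5
trigger = triggered_stimulation(emg, baseline=0.01)
intensity = proportional_stimulation(emg, baseline=0.01, max_level=0.5)
print(trigger.index(True), intensity[-1])
```

The sketch also shows why the approach fails when the contraction cannot be distinguished from baseline: if the envelope never exceeds the threshold, no stimulation is ever triggered.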

Conflicting results have been reported in the literature regarding the benefits of cyclic NMES compared with neuroprosthetics. On the one hand, De Kroon and IJzerman did not detect significant differences in functional outcome [23]. On the other hand, Bolton et al. reported that neuroprosthetics generated greater improvements than cyclic NMES [13]. Despite this discrepancy, it is generally accepted that functional improvement is enhanced when stimulation is associated with voluntary attempts [16]. Cauraugh et al. argued that improvements obtained in stroke patients after neuroprosthetic therapy can be explained in terms of the sensorimotor integration theory. In particular, neuroprosthetic movements produce proprioceptive feedback, an afferent signal that returns to the somatosensory cortex, completing the sensorimotor cycle. The voluntary efferent output as well as the afferent input may assist in organizing the distorted signals arising from the damaged brain area [17]. Indeed, proprioceptive feedback has a critical role in motor planning by updating an internal model of the state [70]. For additional details on triggered NMES studies, the reader can refer to [16, 18, 59, 107].

In order to enhance patients’ recovery, some motor learning principles must be taken into account during the use of neuroprosthetic devices. These considerations include task repetition, novelty of activity, concurrent volitional effort, and high functional content [107].