Abstract
Humans coordinate movements and interact with the environment through sensory information and motor adaptation in their daily lives. Many characteristics of these interactions can be studied using optimization-based models, which assume that precise knowledge of both the sensorimotor system and its interactive environment is available to the central nervous system (CNS). However, both static and dynamic uncertainties inevitably occur in daily movements. When these uncertainties are taken into consideration, previously developed models based on optimization theory may fail to explain how the CNS can still coordinate human movements that are robust with respect to the uncertainties. To address this problem, this paper presents a novel computational mechanism for sensorimotor control from the perspective of robust adaptive dynamic programming (RADP). Sharing some essential features of reinforcement learning, which was originally observed in mammals, the RADP model for sensorimotor control suggests that, instead of identifying the system dynamics of both the motor system and the environment, the CNS iteratively computes a robust optimal control policy using real-time sensory data. An online learning algorithm is provided, with rigorous convergence and stability analysis. It is then applied to simulate several experiments reported in the literature. By comparing the numerical results with the experimentally observed data, the authors show that the proposed model can reproduce movement trajectories that are consistent with experimental observations. In addition, the RADP theory provides a unified framework that connects optimality and robustness properties in the sensorimotor system.
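The idea of iteratively computing a control policy can be illustrated with a much simpler, classical scheme. The sketch below is not the paper's RADP algorithm: it is a model-based policy iteration for a scalar, noise-free linear-quadratic problem, and it assumes the dynamics dx/dt = a x + b u and the cost weights q and r are known, whereas the abstract describes a method that works from sensory data without identifying the dynamics. All names and numerical values (policy_iteration, riccati_solution, a, b, q, r, k0) are hypothetical and chosen only for illustration.

```python
import math


def policy_iteration(a, b, q, r, k0, iterations=10):
    """Model-based policy iteration for dx/dt = a*x + b*u with u = -k*x.

    Cost: integral of (q*x^2 + r*u^2) dt. This is an illustrative,
    deterministic scalar sketch, not the data-driven RADP method.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must stabilize the closed loop")
    k = k0
    history = []
    for _ in range(iterations):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: gain that is optimal with respect to p.
        k = b * p / r
        history.append((p, k))
    return history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


# Hypothetical example: an open-loop unstable plant with unit weights.
history = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
p_final, k_final = history[-1]
p_star = riccati_solution(a=1.0, b=1.0, q=1.0, r=1.0)
```

In this toy setting the cost parameter p decreases at every step and approaches the Riccati solution, and each updated gain keeps the closed loop stable. Data-driven schemes of the kind the abstract describes aim at a similar iteration while replacing the model-dependent evaluation step with quantities estimated from measured trajectories.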
Additional information
This work was supported in part by the US National Science Foundation Grant Nos. ECCS-1101401 and ECCS-1230040.
This paper was recommended for publication by Editor LIU Yungang.
About this article
Cite this article
Jiang, Y., Jiang, ZP. A robust adaptive dynamic programming principle for sensorimotor control with signal-dependent noise. J Syst Sci Complex 28, 261–288 (2015). https://doi.org/10.1007/s11424-015-3310-2
DOI: https://doi.org/10.1007/s11424-015-3310-2