A robust adaptive dynamic programming principle for sensorimotor control with signal-dependent noise

Journal of Systems Science and Complexity

Abstract

Humans coordinate their movements and interact with the environment through sensory information and motor adaptation in daily life. Many characteristics of these interactions can be studied using optimization-based models, which assume that the central nervous system (CNS) has precise knowledge of both the sensorimotor system and the environment it interacts with. However, both static and dynamic uncertainties inevitably arise in daily movements. When these uncertainties are taken into account, previously developed models based on optimization theory may fail to explain how the CNS still coordinates human movements that are robust to them. To address this problem, this paper presents a novel computational mechanism for sensorimotor control from the perspective of robust adaptive dynamic programming (RADP). Sharing some essential features of reinforcement learning, which was originally observed in mammals, the RADP model for sensorimotor control suggests that, instead of identifying the dynamics of both the motor system and the environment, the CNS iteratively computes a robust optimal control policy using real-time sensory data. An online learning algorithm is provided, with rigorous convergence and stability analysis, and is then applied to simulate several experiments reported in the literature. By comparing the numerical results with the experimentally observed data, the authors show that the proposed model reproduces movement trajectories consistent with experimental observations. In addition, the RADP theory provides a unified framework that connects optimality and robustness properties in the sensorimotor system.
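The idea of iteratively computing an optimal control policy can be illustrated with a minimal sketch. The paper's own RADP algorithm is not shown in this preview, so the code below is not the authors' method: it is the classical model-based policy iteration for a scalar linear-quadratic problem, without noise or dynamic uncertainty. The system dx/dt = a*x + b*u, the cost weights q and r, and the function names are all assumptions made for illustration. An adaptive dynamic programming scheme would, roughly speaking, replace the model-based evaluation step with an estimate built from measured data.

```python
import math


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Policy iteration for the scalar system dx/dt = a*x + b*u (assumed model)
    with cost integral of (q*x**2 + r*u**2) dt and linear feedback u = -k*x.

    Returns a list of (p, k) pairs: the cost coefficient of the evaluated
    policy and the improved gain, one pair per iteration.
    """
    k = k0
    history = []
    for _ in range(iterations):
        closed_loop = a - b * k
        if closed_loop >= 0:
            raise ValueError("gain k must stabilize the system (a - b*k < 0)")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop)
        # Policy improvement: the new gain minimizes the cost given p.
        k = b * p / r
        history.append((p, k))
    return history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Hypothetical unstable plant (a > 0) with a stabilizing initial gain.
    for p, k in policy_iteration(1.0, 1.0, 1.0, 1.0, k0=2.0, iterations=5):
        print(p, k)
    print("Riccati solution:", riccati_solution(1.0, 1.0, 1.0, 1.0))
```

Starting from any stabilizing gain, the evaluated cost coefficient decreases toward the Riccati solution, which is why such iterations are a natural template for learning schemes that refine a policy step by step.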

Author information

Corresponding author

Correspondence to Yu Jiang.

Additional information

This work was supported in part by the US National Science Foundation Grant Nos. ECCS-1101401 and ECCS-1230040.

This paper was recommended for publication by Editor LIU Yungang.

About this article

Cite this article

Jiang, Y., Jiang, ZP. A robust adaptive dynamic programming principle for sensorimotor control with signal-dependent noise. J Syst Sci Complex 28, 261–288 (2015). https://doi.org/10.1007/s11424-015-3310-2

  • DOI: https://doi.org/10.1007/s11424-015-3310-2
