Research Article | Open Access

AdaptNet: Policy Adaptation for Physics-Based Character Control

Published: 05 December 2023

Abstract

Motivated by humans' ability to adapt existing skills when learning new ones, this paper presents AdaptNet, an approach for modifying the latent space of existing policies so that new behaviors can be learned from similar tasks far more quickly than when learning from scratch. Building on top of a given reinforcement learning controller, AdaptNet uses a two-tier hierarchy that augments the original state embedding to support modest changes in behavior, and further modifies the policy network layers to make more substantive changes. The technique is shown to be effective for adapting existing physics-based controllers to a wide range of new locomotion styles, new task targets, changes in character morphology, and extensive changes in environment. Furthermore, it exhibits a significant increase in learning efficiency, as indicated by greatly reduced training times when compared to training from scratch or to other approaches that modify existing policies. Code is available at https://motion-lab.github.io/AdaptNet.
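To make the two-tier idea in the abstract concrete, the following is a minimal sketch, assuming a pretrained PyTorch MLP policy: the first tier injects a learned offset into the original state embedding, while the second tier adds low-rank residual weights (in the spirit of LoRA) to the frozen policy layers. All names here (AdaptedPolicy, latent_offset, lora_A, lora_B) are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AdaptedPolicy(nn.Module):
    def __init__(self, encoder: nn.Module, layers: nn.ModuleList,
                 embed_dim: int, rank: int = 4):
        super().__init__()
        # Pretrained components; frozen before the trainable
        # adaptation modules are created below.
        self.encoder = encoder
        self.layers = layers  # assumed to be nn.Linear layers
        for p in self.parameters():
            p.requires_grad_(False)

        # Tier 1: a small network that perturbs the state embedding,
        # zero-initialized so adaptation starts from the original policy.
        self.latent_offset = nn.Sequential(
            nn.Linear(embed_dim, embed_dim),
            nn.ELU(),
            nn.Linear(embed_dim, embed_dim),
        )
        nn.init.zeros_(self.latent_offset[-1].weight)
        nn.init.zeros_(self.latent_offset[-1].bias)

        # Tier 2: low-rank residual weights A @ B added to each frozen
        # linear layer, also initialized so their product is zero.
        self.lora_A = nn.ParameterList()
        self.lora_B = nn.ParameterList()
        for layer in self.layers:
            out_f, in_f = layer.weight.shape
            self.lora_A.append(nn.Parameter(0.01 * torch.randn(out_f, rank)))
            self.lora_B.append(nn.Parameter(torch.zeros(rank, in_f)))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        z = self.encoder(state)
        h = z + self.latent_offset(z)  # tier 1: embedding offset
        n = len(self.layers)
        for i, (layer, A, B) in enumerate(
                zip(self.layers, self.lora_A, self.lora_B)):
            h = layer(h) + h @ (A @ B).T  # tier 2: low-rank residual
            if i < n - 1:
                h = torch.relu(h)
        return h
```

Because both residual pathways start at zero, the adapted policy initially reproduces the pretrained controller, and only the small injected parameter set is trained; this is one plausible reading of how the abstract's "building on top of a given reinforcement learning controller" yields reduced training times.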


Supplemental Material

papers_543s4-file3.mp4 (MP4, 135.9 MB)

