AdaptNet: Policy Adaptation for Physics-Based Character Control

Abstract
Motivated by humans' ability to adapt existing skills when learning new ones, this paper presents AdaptNet, an approach for modifying the latent space of existing policies so that new behaviors can be learned quickly from similar tasks in comparison to learning from scratch. Building on top of a given reinforcement learning controller, AdaptNet uses a two-tier hierarchy that augments the original state embedding to support modest changes in a behavior and further modifies the policy network layers to make more substantive changes. The technique is shown to be effective for adapting existing physics-based controllers to a wide range of new locomotion styles, new task targets, changes in character morphology, and extensive changes in environment. Furthermore, it exhibits a significant increase in learning efficiency, as indicated by greatly reduced training times when compared to training from scratch or to other approaches that modify existing policies. Code is available at https://motion-lab.github.io/AdaptNet.
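The two-tier adaptation described in the abstract can be sketched in a deliberately minimal, hypothetical form: a frozen pretrained layer, an additive injection on the state embedding (tier 1), and a zero-initialized low-rank residual on the layer itself (tier 2). All names here (`AdaptedPolicy`, `inject`, `A`, `B`) are illustrative assumptions, not the paper's implementation, and the real system operates on a full policy network rather than a single linear layer.

```python
# Minimal sketch of two-tier policy adaptation (assumed structure, not the
# paper's code): tier 1 shifts the state embedding, tier 2 adds a low-rank
# residual to a frozen pretrained layer.
import math
import random

random.seed(0)


def linear(W, b, x):
    """Dense layer: W @ x + b, with W as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]


class AdaptedPolicy:
    """Frozen pretrained layer plus two trainable adaptation tiers."""

    def __init__(self, dim):
        # Pretrained weights: kept frozen during adaptation.
        self.W = [[random.gauss(0, 1 / math.sqrt(dim)) for _ in range(dim)]
                  for _ in range(dim)]
        self.b = [0.0] * dim
        # Tier 1: additive injection in the embedding space, zero-initialized.
        self.inject = [0.0] * dim
        # Tier 2: rank-1 residual A * (B . z); A starts at zero so the adapted
        # policy initially reproduces the original one exactly.
        self.A = [0.0] * dim
        self.B = [random.gauss(0, 1) for _ in range(dim)]

    def forward(self, z):
        # Tier 1: modest change via a shift of the state embedding.
        z = [zi + di for zi, di in zip(z, self.inject)]
        # Frozen pretrained layer.
        h = linear(self.W, self.b, z)
        # Tier 2: substantive change via a low-rank residual on the layer.
        proj = sum(bj * zj for bj, zj in zip(self.B, z))
        return [hi + ai * proj for hi, ai in zip(h, self.A)]
```

Zero-initializing both tiers means the adapted policy starts out identical to the pretrained one, so fine-tuning departs smoothly from the existing behavior; this mirrors the zero-init convention used by low-rank adapters such as LoRA.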