Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning

Authors

  • Yingnan Zhao, College of Computer Science and Technology, Harbin Engineering University; National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin Engineering University
  • Xinmiao Wang, College of Computer Science and Technology, Harbin Engineering University; Institute of Artificial Intelligence (TeleAI), China Telecom
  • Dewei Wang, Institute of Artificial Intelligence (TeleAI), China Telecom; School of Information Science and Technology, University of Science and Technology of China
  • Xinzhe Liu, Institute of Artificial Intelligence (TeleAI), China Telecom; School of Information Science and Technology, ShanghaiTech University
  • Dan Lu, College of Computer Science and Technology, Harbin Engineering University; National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin Engineering University
  • Qilong Han, College of Computer Science and Technology, Harbin Engineering University; National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin Engineering University
  • Peng Liu, College of Computer Science and Technology, Harbin Institute of Technology
  • Chenjia Bai, Institute of Artificial Intelligence (TeleAI), China Telecom; Shenzhen Research Institute of Northwestern Polytechnical University

DOI:

https://doi.org/10.1609/aaai.v40i22.38951

Abstract

Humanoid robots hold promise for learning a diverse set of human-like locomotion behaviors, including standing up, walking, running, and jumping. However, existing methods predominantly train an independent policy for each skill, yielding behavior-specific controllers with limited generalization and brittle performance when deployed on irregular terrains and in diverse situations. To address this challenge, we propose Adaptive Humanoid Control (AHC), a two-stage framework that learns an adaptive humanoid locomotion controller across different skills and terrains. Specifically, we first train several primary locomotion policies and perform a multi-behavior distillation process to obtain a basic multi-behavior controller, enabling adaptive behavior switching based on the environment. We then perform reinforced fine-tuning by collecting online feedback while executing adaptive behaviors on more diverse terrains, enhancing the controller's terrain adaptability. We conduct experiments both in simulation and in the real world on the Unitree G1 robot. The results show that our method exhibits strong adaptability across various situations and terrains.

Published

2026-03-14

How to Cite

Zhao, Y., Wang, X., Wang, D., Liu, X., Lu, D., Han, Q., … Bai, C. (2026). Towards Adaptive Humanoid Control via Multi-Behavior Distillation and Reinforced Fine-Tuning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18818–18826. https://doi.org/10.1609/aaai.v40i22.38951

Section

AAAI Technical Track on Intelligent Robotics