Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots

Pranay Dugar; Aayam Shrestha; Fangzhou Yu; Bart van Marum; Alan Fern

doi:10.1609/aaaiss.v7i1.36946

Authors

Pranay Dugar Oregon State University
Aayam Shrestha Oregon State University
Fangzhou Yu Oregon State University
Bart van Marum Oregon State University
Alan Fern Oregon State University


Abstract

A major challenge in humanoid robotics is designing a unified interface for commanding diverse whole-body behaviors, from precise footstep sequences to partial-body mimicry and joystick teleoperation. We introduce the Masked Humanoid Controller (MHC), a learned whole-body controller that exposes a simple yet expressive interface: the specification of masked target trajectories over selected subsets of the robot’s state variables. This unified abstraction allows high-level systems to issue commands in a flexible format that accommodates multi-modal inputs such as optimized trajectories, motion capture clips, re-targeted video, and real-time joystick signals. The MHC is trained in simulation using a curriculum that spans this full range of modalities, enabling robust execution of partially specified behaviors while maintaining balance and disturbance rejection. We demonstrate the MHC both in simulation and on the real-world Digit V3 humanoid, showing that a single learned controller is capable of executing such diverse whole-body commands in the real world through a common representational interface.
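The masked-trajectory interface described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the state-variable names, the array layout, and the helpers `make_masked_command` and `masked_tracking_error` are hypothetical and are not taken from the paper. The sketch only shows the general idea of pairing a target trajectory with a binary mask, so that a high-level system specifies some state variables and leaves the rest to the controller.

```python
import numpy as np

# Hypothetical state layout, for illustration only. The abstract does not
# specify the MHC's actual state variables or their ordering.
STATE_NAMES = [
    "base_vx", "base_vy", "base_yaw_rate",
    "left_arm", "right_arm", "left_foot_x", "right_foot_x",
]


def make_masked_command(targets, horizon):
    """Build a (trajectory, mask) pair from a partial specification.

    `targets` maps a state-variable name to either a scalar (held constant)
    or a sequence of length `horizon`. Variables that are not named stay
    unspecified: their mask entries are False and their values are ignored.
    """
    traj = np.zeros((horizon, len(STATE_NAMES)))
    mask = np.zeros((horizon, len(STATE_NAMES)), dtype=bool)
    for name, values in targets.items():
        j = STATE_NAMES.index(name)
        traj[:, j] = np.broadcast_to(np.asarray(values, dtype=float), (horizon,))
        mask[:, j] = True
    return traj, mask


def masked_tracking_error(state, target, mask):
    """Mean squared error over the commanded (unmasked) entries only."""
    if not mask.any():
        return 0.0
    diff = (state - target)[mask]
    return float(np.mean(diff ** 2))


# A joystick-style command: only base velocity and yaw rate are specified.
joystick_traj, joystick_mask = make_masked_command(
    {"base_vx": 0.5, "base_yaw_rate": 0.0}, horizon=4
)

# A partial-body mimicry command: only the left arm follows a short clip.
arm_traj, arm_mask = make_masked_command(
    {"left_arm": [0.0, 0.1, 0.2, 0.3]}, horizon=4
)
```

Under this assumed layout, a joystick command and a mimicry clip differ only in which mask columns are set, which is one way to read the abstract's "common representational interface". How the real controller encodes and consumes the mask is not stated in the abstract.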
