*Stones are randomly distributed, each 20 cm in size (≈ foot length), with a maximum distance of 45 cm and an average distance of 35 cm between stones.
*The beam width is 20 cm.
*BeamDojo showcases zero-shot transfer to gaps and stepping beams, and demonstrates robustness to missteps. The gap width is 50 cm.
Traversing risky terrains with sparse footholds poses a significant challenge for humanoid robots, requiring precise foot placements and stable locomotion. Existing approaches designed for quadrupedal robots often fail to generalize to humanoid robots due to differences in foot geometry and unstable morphology, while learning-based approaches for humanoid locomotion still face great challenges on complex terrains due to sparse foothold reward signals and inefficient learning processes. To address these challenges, we introduce BeamDojo, a reinforcement learning (RL) framework designed to enable agile humanoid locomotion on sparse footholds. BeamDojo begins by introducing a sampling-based foothold reward tailored for polygonal feet, along with a double critic to balance the learning process between dense locomotion rewards and sparse foothold rewards. To encourage sufficient trial-and-error exploration, BeamDojo incorporates a two-stage RL approach: the first stage relaxes the terrain dynamics by training the humanoid on flat terrain while providing it with perceptive observations of the task terrain, and the second stage fine-tunes the policy on the actual task terrain. Moreover, we implement an onboard LiDAR-based elevation map to enable real-world deployment. Extensive simulation and real-world experiments demonstrate that BeamDojo achieves efficient learning in simulation and enables agile locomotion with precise foot placement on sparse footholds in the real world, maintaining a high success rate even under significant external disturbances.
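To make the two reward-related ideas above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the BeamDojo implementation: the function names, the depth threshold, and the advantage weights are all hypothetical. It assumes the foothold reward counts sampled sole points that fall off a foothold for each foot in contact, and that the double critic yields two advantage estimates that are normalized separately before being blended.

```python
from statistics import fmean, pstdev


def foothold_penalty(sample_heights, in_contact, depth_threshold=-0.1):
    """Sampling-based foothold penalty (hypothetical form).

    sample_heights: one list per foot with the terrain height (m) under each
        point sampled across the sole, relative to the foothold surface.
    in_contact: one boolean per foot, True when that foot is touching down.
    depth_threshold: heights below this count as "off the foothold"
        (assumed value, not taken from the source).
    """
    penalty = 0
    for heights, contact in zip(sample_heights, in_contact):
        if not contact:
            continue  # a swing foot may pass over empty space freely
        # Each sampled point hanging over a gap costs one unit.
        penalty -= sum(1 for h in heights if h < depth_threshold)
    return penalty


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


def mix_advantages(adv_dense, adv_sparse, w_dense=1.0, w_sparse=0.25):
    """Blend advantages from the dense-reward and sparse-reward critics.

    Normalizing each group on its own keeps the frequent locomotion signal
    from drowning out the rare foothold signal. Weights are assumptions.
    """
    dense = normalize(adv_dense)
    sparse = normalize(adv_sparse)
    return [w_dense * d + w_sparse * s for d, s in zip(dense, sparse)]


if __name__ == "__main__":
    # Left foot in contact with two of four samples over a gap; right foot in swing.
    heights = [[0.0, 0.0, -1.0, -1.0], [-1.0, -1.0, -1.0, -1.0]]
    print(foothold_penalty(heights, [True, False]))
    print(mix_advantages([1.0, 2.0, 3.0], [0.0, 0.0, -4.0]))
```

Because the penalty grows with the number of sampled points off the foothold, a partially overlapping foot placement is penalized less than a complete miss, which gives a graded signal for flat, polygonal feet.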
(a) Training in Simulation. BeamDojo incorporates a two-stage RL approach.
(b) Deployment. The robot-centric elevation map, reconstructed using LiDAR data, is combined with proprioceptive information to serve as the input for the actor.
Many excellent works inspire the design of BeamDojo.
@article{wang2025beamdojo,
  title = {BeamDojo: Learning Agile Humanoid Locomotion on Sparse Footholds},
  author = {Wang, Huayi and Wang, Zirui and Ren, Junli and Ben, Qingwei and Huang, Tao and Zhang, Weinan and Pang, Jiangmiao},
  journal = {arXiv preprint arXiv:2502.10363},
  year = {2025},
}