
Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion

27 March 2026

Tianyang Wu

Hanwei Guo

Yuhang Wang

Junshu Yang

Xinyang Sui

Jiayi Xie

Xingyu Chen

Zeyang Liu

Xuguang Lan



Main:8 Pages

18 Figures

Bibliography:3 Pages

14 Tables

Appendix:6 Pages

Abstract

Reinforcement learning has shown strong promise for agile quadrupedal locomotion, even with proprioception-only sensing. In practice, however, the sim-to-real gap and reward overfitting on complex terrains can produce policies that fail to transfer, while physical validation remains risky and inefficient. To address these challenges, we introduce a unified framework that combines a Mixture-of-Experts (MoE) locomotion policy for robust multi-terrain representation with RoboGauge, a predictive assessment suite that quantifies sim-to-real transferability. The MoE policy employs a gated set of specialist experts to decompose latent terrain and command modeling, achieving superior deployment robustness and generalization via proprioception alone. RoboGauge further provides multi-dimensional proprioception-based metrics via sim-to-sim tests over terrains, difficulty levels, and domain randomizations, enabling reliable MoE policy selection without extensive physical trials. Experiments on a Unitree Go2 demonstrate robust locomotion on unseen challenging terrains, including snow, sand, stairs, slopes, and 30 cm obstacles. In dedicated high-speed tests, the robot reaches 4 m/s and exhibits an emergent narrow-width gait associated with improved stability at high velocity.
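To make the phrase "gated set of specialist experts" concrete, the following is a minimal sketch of a generic softmax-gated mixture-of-experts forward pass, not the authors' architecture. Every name here (`MoEPolicy`, `make_linear`, `act`) is hypothetical, the experts are single linear layers chosen for brevity, and the 45-dimensional observation and four experts are assumptions. The 12-dimensional action only reflects the Go2's twelve actuated joints.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear(rng, n_out, n_in):
    """Randomly initialised (weights, bias) pair for a linear layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(weights, bias, x):
    """Apply y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class MoEPolicy:
    """Hypothetical gated mixture of linear experts (illustrative only)."""

    def __init__(self, obs_dim, act_dim, num_experts, seed=0):
        rng = random.Random(seed)
        self.act_dim = act_dim
        self.gate = make_linear(rng, num_experts, obs_dim)
        self.experts = [make_linear(rng, act_dim, obs_dim) for _ in range(num_experts)]

    def act(self, obs):
        # The gate maps the proprioceptive observation to one weight per expert.
        gate_weights = softmax(linear(*self.gate, obs))
        # Each expert proposes its own action from the same observation.
        expert_actions = [linear(w, b, obs) for w, b in self.experts]
        # The final action is the gate-weighted sum of the expert proposals.
        action = [
            sum(g * a[i] for g, a in zip(gate_weights, expert_actions))
            for i in range(self.act_dim)
        ]
        return action, gate_weights


# Assumed sizes: 45-dim observation, 12 joint targets, 4 experts.
policy = MoEPolicy(obs_dim=45, act_dim=12, num_experts=4)
obs_rng = random.Random(1)
obs = [obs_rng.uniform(-1.0, 1.0) for _ in range(45)]
action, gate_weights = policy.act(obs)
```

The gate weights are non-negative and sum to one, so the output is a convex combination of the expert actions. Inspecting `gate_weights` across terrains is one simple way to check whether experts specialize.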
