Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion
On-robot reinforcement learning is a promising approach for training embodiment-aware policies for legged robots. However, the computational constraints of real-time learning on robots pose a significant challenge. We present a framework for efficiently learning quadruped locomotion in just 8 minutes of raw real-time training, exploiting the sample efficiency and minimal computational overhead of the new off-policy algorithm CrossQ. We investigate two control architectures: predicting joint target positions for agile, high-speed locomotion, and Central Pattern Generators for stable, natural gaits. While prior work focused on learning simple forward gaits, our framework extends on-robot learning to omnidirectional locomotion. We demonstrate the robustness of our approach in different indoor and outdoor environments.
@article{bohlinger2025_2503.08375,
  title={Gait in Eight: Efficient On-Robot Learning for Omnidirectional Quadruped Locomotion},
  author={Nico Bohlinger and Jonathan Kinzel and Daniel Palenicek and Lukasz Antczak and Jan Peters},
  journal={arXiv preprint arXiv:2503.08375},
  year={2025}
}