Safe Active Dynamics Learning and Control: A Sequential
Exploration-Exploitation Framework
Safe deployment of autonomous robots in diverse scenarios requires agents that are capable of efficiently adapting to new environments while satisfying constraints. In this work, we propose a practical and theoretically-justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation: the expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty where reaching the goal safely is initially infeasible by first exploring to gather data and reduce uncertainty, before autonomously exploiting the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework has high-probability guarantees of satisfying all constraints at all times jointly. This analysis also motivates two regularizers of last-layer meta-learners that improve online adaptation capabilities as well as performance by reducing the size of the confidence sets. We extensively demonstrate our approach in simulation and on hardware.
View on arXiv