An Architecture for Unattended Containerized (Deep) Reinforcement Learning with Webots

6 February 2024

Abstract

As data science applications gain adoption across industries, the tooling landscape matures to facilitate the life cycle of such applications and provide solutions to the challenges involved to boost the productivity of the people involved. Reinforcement learning with agents in a 3D world could still face challenges: the knowledge required to use a simulation software as well as the utilization of a standalone simulation software in unattended training pipelines. In this paper we review tools and approaches to train reinforcement learning agents for robots in 3D worlds with respect to the robot Robotino and argue that the separation of the simulation environment for creators of virtual worlds and the model development environment for data scientists is not a well covered topic. Often both are the same and data scientists require knowledge of the simulation software to work directly with their APIs. Moreover, sometimes creators of virtual worlds and data scientists even work on the same files. We want to contribute to that topic by describing an approach where data scientists don't require knowledge about the simulation software. Our approach uses the standalone simulation software Webots, the Robot Operating System to communicate with simulated robots as well as the simulation software itself and container technology to separate the simulation from the model development environment. We put emphasize on the APIs the data scientists work with and the use of a standalone simulation software in unattended training pipelines. We show the parts that are specific to the Robotino and the robot task to learn.

View on arXiv

Comments on this paper