Language-Grounded Hierarchical Planning and Execution with Multi-Robot 3D Scene Graphs

In this paper, we introduce a multi-robot system that integrates mapping, localization, and task and motion planning (TAMP) enabled by 3D scene graphs to execute complex instructions expressed in natural language. Our system builds a shared 3D scene graph incorporating an open-set object-based map, which is leveraged for multi-robot 3D scene graph fusion. This representation supports real-time, view-invariant relocalization (via the object-based map) and planning (via the 3D scene graph), allowing a team of robots to reason about their surroundings and execute complex tasks. Additionally, we introduce a planning approach that translates operator intent into Planning Domain Definition Language (PDDL) goals using a Large Language Model (LLM) by leveraging context from the shared 3D scene graph and robot capabilities. We provide an experimental assessment of the performance of our system on real-world tasks in large-scale, outdoor environments.
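The abstract's LLM-based goal translation step can be illustrated with a minimal sketch. All names, the prompt format, and the scene-graph schema below are hypothetical assumptions for illustration, not the paper's actual interface: the idea is to assemble an LLM prompt from scene-graph objects and the operator's instruction, then check that a returned PDDL goal is grounded in objects the robots actually know about.

```python
# Hypothetical sketch of grounding a natural-language instruction into a
# PDDL goal using scene-graph context. Object names, prompt wording, and
# the flat goal format are illustrative assumptions, not the paper's API.

def build_prompt(instruction: str, objects: dict[str, str]) -> str:
    """Assemble an LLM prompt listing scene-graph objects as context."""
    obj_lines = "\n".join(f"- {name}: {cls}" for name, cls in objects.items())
    return (
        "Scene objects:\n" + obj_lines + "\n\n"
        f"Operator instruction: {instruction}\n"
        "Respond with a single PDDL goal expression, e.g. (at robot1 car3)."
    )

def goal_is_grounded(goal: str, objects: dict[str, str]) -> bool:
    """Check that every argument of a flat PDDL goal names a known object."""
    tokens = goal.strip("() \n").split()
    # tokens[0] is the predicate; all remaining tokens must be scene objects
    return len(tokens) >= 2 and all(t in objects for t in tokens[1:])

# Example scene-graph object table (hypothetical)
objects = {"robot1": "robot", "car3": "vehicle", "cone7": "traffic_cone"}
prompt = build_prompt("drive to the car", objects)
print(goal_is_grounded("(at robot1 car3)", objects))    # True
print(goal_is_grounded("(at robot1 house9)", objects))  # False
```

In a full system, the validation step would guard against the LLM hallucinating objects absent from the shared map before the goal is handed to the PDDL planner.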
@article{strader2025_2506.07454,
  title={Language-Grounded Hierarchical Planning and Execution with Multi-Robot 3D Scene Graphs},
  author={Jared Strader and Aaron Ray and Jacob Arkin and Mason B. Peterson and Yun Chang and Nathan Hughes and Christopher Bradley and Yi Xuan Jia and Carlos Nieto-Granda and Rajat Talak and Chuchu Fan and Luca Carlone and Jonathan P. How and Nicholas Roy},
  journal={arXiv preprint arXiv:2506.07454},
  year={2025}
}