ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes

17 June 2025

Main: 8 Pages

12 Figures

Bibliography: 4 Pages

6 Tables

Appendix: 8 Pages

Abstract

Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning, incorporating both a novel geometry- and spatially-embedded scene representation and a comprehensive safety curriculum, enabling general, dynamic, and safe grasping behaviors. Through imitation learning, we distill the teacher's knowledge into a student 3D diffusion policy (DP3) that operates on partial point cloud observations. To the best of our knowledge, this represents the first zero-shot sim-to-real closed-loop system for target-oriented dexterous grasping in cluttered scenes, demonstrating robust performance across diverse objects and layouts. More details and videos are available at this https URL.

@article{chen2025_2506.14317,
  title={ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes},
  author={Zeyuan Chen and Qiyang Yan and Yuanpei Chen and Tianhao Wu and Jiyao Zhang and Zihan Ding and Jinzhou Li and Yaodong Yang and Hao Dong},
  journal={arXiv preprint arXiv:2506.14317},
  year={2025}
}
