OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model

25 May 2025
Zhenhao Zhang
Ye Shi
Lingxiao Yang
Suting Ni
Qi Ye
Jingya Wang
Main: 10 pages · 5 figures · 12 tables · Bibliography: 3 pages · Appendix: 8 pages
Abstract

Understanding and synthesizing realistic 3D hand-object interactions (HOI) is critical for applications ranging from immersive AR/VR to dexterous robotics. Existing methods struggle with generalization, performing well on closed-set objects and predefined tasks but failing to handle unseen objects or open-vocabulary instructions. We introduce OpenHOI, the first framework for open-world HOI synthesis, capable of generating long-horizon manipulation sequences for novel objects guided by free-form language commands. Our approach integrates a 3D Multimodal Large Language Model (MLLM) fine-tuned for joint affordance grounding and semantic task decomposition, enabling precise localization of interaction regions (e.g., handles, buttons) and breakdown of complex instructions (e.g., "Find a water bottle and take a sip") into executable sub-tasks. To synthesize physically plausible interactions, we propose an affordance-driven diffusion model paired with a training-free physics refinement stage that minimizes penetration and optimizes affordance alignment. Evaluations across diverse scenarios demonstrate OpenHOI's superiority over state-of-the-art methods in generalizing to novel object categories, multi-stage tasks, and complex language instructions. Our project page is available at this https URL.
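
As a rough illustration of the training-free physics refinement mentioned in the abstract, the sketch below performs gradient-based post-processing of a hand pose against a penetration penalty plus an affordance-alignment term. The spherical SDF stand-in, the loss weights, the joint layout, and all function names are hypothetical assumptions for illustration and are not taken from the paper's implementation.

# Hypothetical sketch of an OpenHOI-style training-free physics refinement step.
# The spherical SDF stand-in, loss weights, and joint layout are illustrative
# assumptions, not the authors' implementation.
import torch


def signed_distance(points, obj_center, obj_radius):
    # Toy signed distance to a sphere used as a stand-in for the object surface;
    # a real system would query an SDF built from the object mesh.
    return torch.linalg.norm(points - obj_center, dim=-1) - obj_radius


def refine_hand_pose(hand_joints, obj_center, obj_radius, affordance_point,
                     steps=200, lr=1e-2, w_pen=1.0, w_aff=0.5):
    """Push joints out of the object (penetration term) while pulling the
    fingertip joints toward the grounded affordance region (alignment term)."""
    joints = hand_joints.clone().requires_grad_(True)
    opt = torch.optim.Adam([joints], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        sdf = signed_distance(joints, obj_center, obj_radius)
        pen_loss = torch.relu(-sdf).sum()                    # joints inside the object
        aff_loss = torch.linalg.norm(
            joints[-5:] - affordance_point, dim=-1).mean()   # fingertip proximity
        loss = w_pen * pen_loss + w_aff * aff_loss
        loss.backward()
        opt.step()
    return joints.detach()


if __name__ == "__main__":
    hand = torch.randn(21, 3) * 0.05                         # 21 toy hand joints
    refined = refine_hand_pose(
        hand,
        obj_center=torch.zeros(3),
        obj_radius=torch.tensor(0.04),
        affordance_point=torch.tensor([0.0, 0.05, 0.0]),
    )
    print(refined.shape)                                     # torch.Size([21, 3])

In this reading, the diffusion model proposes the interaction sequence and the refinement only adjusts poses post hoc, which is why no training is involved at this stage.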

@article{zhang2025_2505.18947,
  title={OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model},
  author={Zhenhao Zhang and Ye Shi and Lingxiao Yang and Suting Ni and Qi Ye and Jingya Wang},
  journal={arXiv preprint arXiv:2505.18947},
  year={2025}
}