Human2Robot: Learning Robot Actions from Paired Human-Robot Videos

23 February 2025
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu-Gang Jiang
Abstract

Distilling knowledge from human demonstrations is a promising way for robots to learn and act. Existing work often overlooks the differences between humans and robots, producing unsatisfactory results. In this paper, we study how perfectly aligned human-robot pairs benefit robot learning. Capitalizing on VR-based teleoperation, we introduce H&R, a third-person dataset with 2,600 episodes, each of which captures the fine-grained correspondence between a human hand and a robot gripper. Inspired by the recent success of diffusion models, we introduce Human2Robot, an end-to-end diffusion framework that formulates learning from human demonstration as a generative task. Human2Robot fully exploits the temporal dynamics in human videos to generate robot videos and predict actions at the same time. Through comprehensive evaluations on 4 carefully selected real-world tasks, we demonstrate that Human2Robot not only generates high-quality robot videos but also performs well on seen tasks and generalizes to different positions, unseen appearances, novel instances, and even new backgrounds and task types.
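The abstract describes the framework only at a high level. As a rough illustration of the underlying idea, namely conditioning a diffusion-style denoiser on encoded human-video frames so that it jointly predicts a robot-video latent and an action, the PyTorch sketch below may help. It is not the authors' architecture: every module, dimension, and the crude one-step denoising shortcut is an assumption made purely for illustration.

import torch
import torch.nn as nn

class Human2RobotSketch(nn.Module):
    """Illustrative sketch: a conditional denoiser over robot-video latents,
    conditioned on human-video features, with a joint action head.
    All layer choices and dimensions are assumptions, not the paper's design."""

    def __init__(self, latent_dim=64, action_dim=7, cond_dim=64, hidden=256):
        super().__init__()
        # Summarize a sequence of human-video latents into a context vector.
        self.human_encoder = nn.GRU(cond_dim, hidden, batch_first=True)
        # Denoiser: predicts the noise added to the robot-video latent,
        # conditioned on the human context and the diffusion timestep.
        self.denoiser = nn.Sequential(
            nn.Linear(latent_dim + hidden + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, latent_dim),
        )
        # Action head: reads the (roughly) denoised robot latent plus context.
        self.action_head = nn.Sequential(
            nn.Linear(latent_dim + hidden, hidden),
            nn.SiLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, noisy_robot_latent, human_latents, t):
        # human_latents: (B, T, cond_dim); noisy_robot_latent: (B, latent_dim)
        # t: (B, 1) normalized diffusion timestep in [0, 1]
        _, h = self.human_encoder(human_latents)          # h: (1, B, hidden)
        ctx = h.squeeze(0)                                # (B, hidden)
        eps_hat = self.denoiser(torch.cat([noisy_robot_latent, ctx, t], dim=-1))
        denoised = noisy_robot_latent - eps_hat           # crude one-step estimate
        action = self.action_head(torch.cat([denoised, ctx], dim=-1))
        return eps_hat, action

if __name__ == "__main__":
    model = Human2RobotSketch()
    B, T = 2, 8
    human = torch.randn(B, T, 64)        # encoded human-video frames
    noisy = torch.randn(B, 64)           # noised robot-video latent
    t = torch.rand(B, 1)                 # diffusion timestep
    eps_hat, action = model(noisy, human, t)
    print(eps_hat.shape, action.shape)   # torch.Size([2, 64]) torch.Size([2, 7])

In a full diffusion pipeline the denoiser would be applied iteratively over many timesteps and trained with a noise-prediction loss; the single subtraction above merely stands in for that loop to keep the sketch short.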

@article{xie2025_2502.16587,
  title={Human2Robot: Learning Robot Actions from Paired Human-Robot Videos},
  author={Sicheng Xie and Haidong Cao and Zejia Weng and Zhen Xing and Shiwei Shen and Jiaqi Leng and Xipeng Qiu and Yanwei Fu and Zuxuan Wu and Yu-Gang Jiang},
  journal={arXiv preprint arXiv:2502.16587},
  year={2025}
}