A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards

12 February 2025
Shivansh Patel
Xinchen Yin
Wenlong Huang
Shubham Garg
H. Nayyeri
Li Fei-Fei
Svetlana Lazebnik
Yunzhu Li

Papers citing "A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards"

50 / 91 papers shown
Eurekaverse: Environment Curriculum Generation via Large Language Models
William Liang
Sam Wang
Hung-Ju Wang
Osbert Bastani
Dinesh Jayaraman
Yecheng Jason Ma
SyDa
81
2
0
04 Nov 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
93
9
0
21 Oct 2024
Guiding Long-Horizon Task and Motion Planning with Vision Language Models
Zhutian Yang
Caelan Reed Garrett
Dieter Fox
Tomás Lozano-Pérez
Leslie Pack Kaelbling
LM&Ro
82
17
0
03 Oct 2024
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
Jiafei Duan
Wilbert Pumacay
Nishanth Kumar
Yi Ru Wang
Shulin Tian
Wentao Yuan
Ranjay Krishna
Dieter Fox
Ajay Mandlekar
Yijie Guo
VLM
LRM
82
23
0
01 Oct 2024
KALIE: Fine-Tuning Vision-Language Models for Open-World Manipulation without Robot Data
Grace Tang
Swetha Rajkumar
Yifei Zhou
Homer Walke
Sergey Levine
Kuan Fang
LM&Ro
VLM
36
8
0
21 Sep 2024
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
Wenlong Huang
Chen Wang
Yunzhu Li
Ruohan Zhang
Li Fei-Fei
84
101
0
03 Sep 2024
SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views
Chao Xu
Ang Li
Linghao Chen
Yulin Liu
Ruoxi Shi
Hao Su
Minghua Liu
3DGS
79
21
0
19 Aug 2024
VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation
Wentao Zhao
Jiaming Chen
Ziyu Meng
Donghui Mao
Ran Song
Wei Zhang
87
9
0
13 Jul 2024
Robotic Control via Embodied Chain-of-Thought Reasoning
Michał Zawalski
William Chen
Karl Pertsch
Oier Mees
Chelsea Finn
Sergey Levine
LRM
LM&Ro
88
69
0
11 Jul 2024
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics
Wentao Yuan
Jiafei Duan
Valts Blukis
Wilbert Pumacay
Ranjay Krishna
Adithyavairavan Murali
Arsalan Mousavian
Dieter Fox
LM&Ro
59
56
0
15 Jun 2024
Real2Code: Reconstruct Articulated Objects via Code Generation
Zhao Mandi
Yijia Weng
Dominik Bauer
Shuran Song
70
17
0
12 Jun 2024
DrEureka: Language Model Guided Sim-To-Real Transfer
Yecheng Jason Ma
William Liang
Hung-Ju Wang
Sam Wang
Yuke Zhu
Linxi Fan
Osbert Bastani
Dinesh Jayaraman
99
43
0
04 Jun 2024
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
118
392
0
20 May 2024
URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images
Zoey Chen
Aaron Walsman
Marius Memmel
Kaichun Mo
Alex Fang
Karthikeya Vemuri
Alan Wu
Dieter Fox
Abhishek Gupta
AI4CE
VGen
70
28
0
19 May 2024
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
Yunfan Jiang
Chen Wang
Ruohan Zhang
Jiajun Wu
Fei-Fei Li
OnRL
54
26
0
16 May 2024
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
Norman Di Palo
Edward Johns
74
33
0
28 Mar 2024
CoPa: General Robotic Manipulation through Spatial Constraints of Parts with Foundation Models
Haoxu Huang
Fanqi Lin
Yingdong Hu
Shengjie Wang
Yang Gao
72
53
0
13 Mar 2024
Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation
M. Torné
Anthony Simeonov
Zechu Li
April Chan
Tao Chen
Abhishek Gupta
Pulkit Agrawal
57
61
0
06 Mar 2024
RT-H: Action Hierarchies Using Language
Suneel Belkhale
Tianli Ding
Ted Xiao
P. Sermanet
Quan Vuong
Jonathan Tompson
Yevgen Chebotar
Debidatta Dwibedi
Dorsa Sadigh
LM&Ro
52
81
0
04 Mar 2024
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Jacky Liang
Fei Xia
Wenhao Yu
Andy Zeng
Montse Gonzalez Arenas
...
N. Heess
Kanishka Rao
Nik Stewart
Jie Tan
Carolina Parada
LM&Ro
73
35
0
18 Feb 2024
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Soroush Nasiriany
Fei Xia
Wenhao Yu
Ted Xiao
Jacky Liang
...
Karol Hausman
N. Heess
Chelsea Finn
Sergey Levine
Brian Ichter
LM&Ro
LRM
38
98
0
12 Feb 2024
Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion
Tairan He
Chong Zhang
Wenli Xiao
Guanqi He
Changliu Liu
Guanya Shi
61
65
0
31 Jan 2024
Generative Expressive Robot Behaviors using Large Language Models
Karthik Mahadevan
Jonathan M. Chien
Noah Brown
Zhuo Xu
Carolina Parada
Fei Xia
Andy Zeng
Leila Takayama
Dorsa Sadigh
LM&Ro
56
38
0
26 Jan 2024
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
Boyuan Chen
Zhuo Xu
Sean Kirmani
Brian Ichter
Danny Driess
Pete Florence
Dorsa Sadigh
Leonidas Guibas
Fei Xia
LRM
ReLM
57
231
0
22 Jan 2024
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Bowen Wen
Wei Yang
Jan Kautz
Stanley T. Birchfield
45
190
0
13 Dec 2023
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections
Lihan Zha
Yuchen Cui
Li-Heng Lin
Minae Kwon
Montse Gonzalez Arenas
Andy Zeng
Fei Xia
Dorsa Sadigh
55
36
0
17 Nov 2023
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
Minghua Liu
Ruoxi Shi
Linghao Chen
Zhuoyang Zhang
Chao Xu
Xinyue Wei
Hansheng Chen
Chong Zeng
Jiayuan Gu
Hao Su
63
198
0
14 Nov 2023
Large Language Models for Robotics: A Survey
Fanlong Zeng
Wensheng Gan
Yongheng Wang
Ning Liu
Philip S. Yu
LM&Ro
135
133
0
13 Nov 2023
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
Ruoxi Shi
Hansheng Chen
Zhuoyang Zhang
Minghua Liu
Chao Xu
Xinyue Wei
Linghao Chen
Chong Zeng
Hao Su
VLM
41
351
0
23 Oct 2023
Creative Robot Tool Use with Large Language Models
Mengdi Xu
Peide Huang
Wenhao Yu
Shiqi Liu
Xilun Zhang
Yaru Niu
Tingnan Zhang
Fei Xia
Jie Tan
Ding Zhao
LM&Ro
LLMAG
71
39
0
19 Oct 2023
Eureka: Human-Level Reward Design via Coding Large Language Models
Yecheng Jason Ma
William Liang
Guanzhi Wang
De-An Huang
Osbert Bastani
Dinesh Jayaraman
Yuke Zhu
Linxi Fan
A. Anandkumar
42
304
0
19 Oct 2023
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration
Abby O'Neill
Abdul Rehman
Abhinav Gupta
Abhiram Maddukuri
...
Zhuo Xu
Zichen Jeff Cui
Zichen Zhang
Zipeng Fu
Zipeng Lin
LM&Ro
102
487
0
13 Oct 2023
Generalizable Long-Horizon Manipulations with Large Language Models
Haoyu Zhou
Mingyu Ding
Weikun Peng
Masayoshi Tomizuka
Lin Shao
Chuang Gan
LM&Ro
23
14
0
03 Oct 2023
Dynamic Handover: Throw and Catch with Bimanual Hands
Binghao Huang
Yuanpei Chen
Tianyu Wang
Yuzhe Qin
Yaodong Yang
Nikolay Atanasov
Xiaolong Wang
20
38
0
11 Sep 2023
Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation
Yuanpei Chen
Chen Wang
Fei-Fei Li
C. Karen Liu
39
42
0
02 Sep 2023
PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects
Jiayi Liu
Ali Mahdavi-Amiri
Manolis Savva
3DPC
55
38
0
14 Aug 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
70
1,172
0
28 Jul 2023
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Wenlong Huang
Chen Wang
Ruohan Zhang
Yunzhu Li
Jiajun Wu
Li Fei-Fei
LM&Ro
59
488
0
12 Jul 2023
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners
Allen Z. Ren
Anushri Dixit
Alexandra Bodrova
Sumeet Singh
Stephen Tu
...
Jacob Varley
Zhenjia Xu
Dorsa Sadigh
Andy Zeng
Anirudha Majumdar
LM&Ro
174
224
0
04 Jul 2023
HomeRobot: Open-Vocabulary Mobile Manipulation
Sriram Yenamandra
A. Ramachandran
Karmesh Yadav
Austin S. Wang
Mukul Khanna
...
Devendra Singh Chaplot
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
76
80
0
20 Jun 2023
Language to Rewards for Robotic Skill Synthesis
Wenhao Yu
Nimrod Gileadi
Chuyuan Fu
Sean Kirmani
Kuang-Huei Lee
...
N. Heess
Dorsa Sadigh
Jie Tan
Yuval Tassa
F. Xia
LM&Ro
61
273
0
14 Jun 2023
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Siyuan Huang
Zhengkai Jiang
Hao Dong
Yu Qiao
Peng Gao
Hongsheng Li
LM&Ro
54
93
0
18 May 2023
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
Bowen Wen
Jonathan Tremblay
Valts Blukis
Stephen Tyree
Thomas Müller
Alex Evans
Dieter Fox
Jan Kautz
Stan Birchfield
3DH
102
131
0
24 Mar 2023
Zero-1-to-3: Zero-shot One Image to 3D Object
Ruoshi Liu
Rundi Wu
Basile Van Hoorick
P. Tokmakov
Sergey Zakharov
Carl Vondrick
DiffM
51
1,064
0
20 Mar 2023
Rotating without Seeing: Towards In-hand Dexterity through Touch
Zhao-Heng Yin
Binghao Huang
Yuzhe Qin
Qifeng Chen
Xiaolong Wang
124
95
0
20 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
366
13,788
0
15 Mar 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents
Wenlong Huang
Fei Xia
Dhruv Shah
Danny Driess
Andy Zeng
...
Pete Florence
Igor Mordatch
Sergey Levine
Karol Hausman
Brian Ichter
LM&Ro
57
46
0
01 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
365
4,406
0
30 Jan 2023
AnyGrasp: Robust and Efficient Grasp Perception in Spatial and Temporal Domains
Haoshu Fang
Chenxi Wang
Hongjie Fang
Minghao Gou
Jirong Liu
Hengxu Yan
Wenhai Liu
Yichen Xie
Cewu Lu
56
200
0
16 Dec 2022
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
54
1,068
0
13 Dec 2022