ResearchTrend.AI
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards

12 February 2025
Shivansh Patel
Xinchen Yin
Wenlong Huang
Shubham Garg
H. Nayyeri
Li Fei-Fei
Svetlana Lazebnik
Yunzhu Li

Papers citing "A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards"

41 / 91 papers shown
Title
DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation
Yuzhe Qin
Binghao Huang
Zhao-Heng Yin
Hao Su
Xiaolong Wang
3DPC
39
78
0
17 Nov 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
57
343
0
06 Oct 2022
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images
Jun Gao
Tianchang Shen
Zian Wang
Wenzheng Chen
K. Yin
Daiqing Li
Or Litany
Zan Gojcic
Sanja Fidler
57
442
0
22 Sep 2022
Multi-skill Mobile Manipulation for Object Rearrangement
Jiayuan Gu
Devendra Singh Chaplot
Hao Su
Jitendra Malik
57
53
0
06 Sep 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer
A. Gritsenko
Austin Stone
Maxim Neumann
Dirk Weissenborn
...
Zhuoran Shen
Tianlin Li
Xiaohua Zhai
Thomas Kipf
N. Houlsby
ObjD
CLIP
VLM
ViT
OCL
72
310
0
12 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
104
1,279
0
04 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
230
3,458
0
29 Apr 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn
Anthony Brohan
Noah Brown
Yevgen Chebotar
Omar Cortes
...
Ted Xiao
Peng Xu
Sichun Xu
Mengyuan Yan
Andy Zeng
LM&Ro
95
1,901
0
04 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
111
577
0
01 Apr 2022
Ditto: Building Digital Twins of Articulated Objects from Interaction
Zhenyu Jiang
Cheng-Chun Hsu
Yuke Zhu
17
104
0
16 Feb 2022
Bayesian Imitation Learning for End-to-End Mobile Manipulation
Yuqing Du
Daniel Ho
Alexander A. Alemi
Eric Jang
Mohi Khansari
SSL
34
10
0
15 Feb 2022
You Only Demonstrate Once: Category-Level Manipulation from Single Visual Demonstration
Bowen Wen
Wenzhao Lian
Kostas Bekris
S. Schaal
36
91
0
30 Jan 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Steven Hoi
MLLM
BDL
VLM
CLIP
426
4,283
0
28 Jan 2022
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Wenlong Huang
Pieter Abbeel
Deepak Pathak
Igor Mordatch
LM&Ro
56
1,078
0
18 Jan 2022
CLIPort: What and Where Pathways for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
Dieter Fox
LM&Ro
80
640
0
24 Sep 2021
CaTGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation
Bowen Wen
Wenzhao Lian
Kostas Bekris
S. Schaal
39
75
0
19 Sep 2021
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
Viktor Makoviychuk
Lukasz Wawrzyniak
Yunrong Guo
Michelle Lu
Kier Storey
...
David Hoeller
Nikita Rudin
Arthur Allshire
Ankur Handa
Gavriel State
108
1,049
0
24 Aug 2021
BayesSimIG: Scalable Parameter Inference for Adaptive Domain Randomization with IsaacGym
Rika Antonova
Fabio Ramos
Rafael Possas
Dieter Fox
28
4
0
09 Jul 2021
RMA: Rapid Motor Adaptation for Legged Robots
Ashish Kumar
Zipeng Fu
Deepak Pathak
Jitendra Malik
114
564
0
08 Jul 2021
A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation
Jiteng Mu
Weichao Qiu
Adam Kortylewski
Alan Yuille
Nuno Vasconcelos
Xiaolong Wang
63
106
0
15 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
537
28,659
0
26 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
376
3,778
0
11 Feb 2021
Learning Quadrupedal Locomotion over Challenging Terrain
Joonho Lee
Jemin Hwangbo
Lorenz Wellhausen
V. Koltun
Marco Hutter
81
1,154
0
21 Oct 2020
RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real
Kanishka Rao
Chris Harris
A. Irpan
Sergey Levine
Julian Ibarz
Mohi Khansari
56
187
0
16 Jun 2020
Sim2Real2Sim: Bridging the Gap Between Simulation and Real-World in Flexible Object Manipulation
Peng Chang
T. Padır
25
40
0
06 Feb 2020
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Mateusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
53
1,215
0
16 Oct 2019
Meta Reinforcement Learning for Sim-to-real Domain Adaptation
Karol Arndt
Murtaza Hazara
Ali Ghadirzadeh
Ville Kyrki
136
104
0
16 Sep 2019
Shape2Motion: Joint Analysis of Motion Parts and Attributes from 3D Shapes
Xiaogang Wang
Bin Zhou
Yahao Shi
Xiaowu Chen
Qinping Zhao
Kai Xu
3DPC
45
127
0
10 Mar 2019
Sim-to-Real via Sim-to-Sim: Data-efficient Robotic Grasping via Randomized-to-Canonical Adaptation Networks
Stephen James
Paul Wohlhart
Mrinal Kalakrishnan
Dmitry Kalashnikov
A. Irpan
Julian Ibarz
Sergey Levine
R. Hadsell
Konstantinos Bousmalis
65
446
0
18 Dec 2018
Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience
Yevgen Chebotar
Ankur Handa
Viktor Makoviychuk
Miles Macklin
J. Issac
Nathan D. Ratliff
Dieter Fox
61
503
0
12 Oct 2018
Sim-to-Real: Learning Agile Locomotion For Quadruped Robots
Jie Tan
Tingnan Zhang
Erwin Coumans
Atil Iscen
Yunfei Bai
Danijar Hafner
Steven Bohez
Vincent Vanhoucke
65
798
0
27 Apr 2018
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization
Xue Bin Peng
Marcin Andrychowicz
Wojciech Zaremba
Pieter Abbeel
78
1,355
0
18 Oct 2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations
Aravind Rajeswaran
Vikash Kumar
Abhishek Gupta
Giulia Vezzani
John Schulman
E. Todorov
Sergey Levine
85
1,079
0
28 Sep 2017
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Konstantinos Bousmalis
A. Irpan
Paul Wohlhart
Yunfei Bai
Matthew Kelcey
...
Julian Ibarz
P. Pastor
K. Konolige
Sergey Levine
Vincent Vanhoucke
OOD
62
654
0
22 Sep 2017
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Matej Vecerík
Todd Hester
Jonathan Scholz
Fumin Wang
Olivier Pietquin
Bilal Piot
N. Heess
Thomas Rothörl
Thomas Lampe
Martin Riedmiller
OffRL
49
661
0
27 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
185
18,685
0
20 Jul 2017
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
I. Popov
N. Heess
Timothy Lillicrap
Roland Hafner
Gabriel Barth-Maron
Matej Vecerík
Thomas Lampe
Yuval Tassa
Tom Erez
Martin Riedmiller
OffRL
46
264
0
10 Apr 2017
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
Joshua Tobin
Rachel Fong
Alex Ray
Jonas Schneider
Wojciech Zaremba
Pieter Abbeel
130
2,948
0
20 Mar 2017
Real-time Perception meets Reactive Motion Generation
Daniel Kappler
Franziska Meier
J. Issac
Jim Mainprice
C. Cifuentes
Manuel Wüthrich
V. Berenz
S. Schaal
Nathan D. Ratliff
Jeannette Bohg
25
98
0
10 Mar 2017
Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates
S. Gu
E. Holly
Timothy Lillicrap
Sergey Levine
OffRL
SSL
82
1,474
0
03 Oct 2016
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert
Thomas Unterthiner
Sepp Hochreiter
184
5,502
0
23 Nov 2015