ResearchTrend.AI
Open-World Object Manipulation using Pre-trained Vision-Language Models


2 March 2023
Austin Stone
Ted Xiao
Yao Lu
K. Gopalakrishnan
Kuang-Huei Lee
Q. Vuong
Paul Wohlhart
Sean Kirmani
Brianna Zitkovich
F. Xia
Chelsea Finn
Karol Hausman
    LM&Ro

Papers citing "Open-World Object Manipulation using Pre-trained Vision-Language Models"

50 / 108 papers shown
Octo: An Open-Source Generalist Robot Policy
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
...
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
66
356
0
20 May 2024
Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills
Tianhao Wei
Liqian Ma
Rui Chen
Weiye Zhao
Changliu Liu
45
3
0
18 May 2024
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Koffivi Fidele Gbagbe
Miguel Altamirano Cabrera
Ali Alabbas
Oussama Alyunes
Artem Lykov
Dzmitry Tsetserukou
LM&Ro
40
18
0
09 May 2024
Empowering Large Language Models on Robotic Manipulation with Affordance Prompting
Guangran Cheng
Chuheng Zhang
Wenzhe Cai
Li Zhao
Changyin Sun
Jiang Bian
LM&Ro
LLMAG
189
9
0
17 Apr 2024
RAIL: Robot Affordance Imagination with Large Language Models
Ceng Zhang
Xin Meng
Dongchen Qi
Gregory S. Chirikjian
LM&Ro
37
3
0
28 Mar 2024
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers
Vidhi Jain
Maria Attarian
Nikhil J. Joshi
Ayzaan Wahid
Danny Driess
...
Stefan Welker
Christine Chan
Igor Gilitschenski
Yonatan Bisk
Debidatta Dwibedi
68
29
0
19 Mar 2024
Scaling Instructable Agents Across Many Simulated Worlds
Sima Team
Maria Abi Raad
Arun Ahuja
Catarina Barros
F. Besse
...
Daan Wierstra
Duncan Williams
Nathaniel Wong
Sarah York
Nick Young
LM&Ro
115
38
0
13 Mar 2024
Learning Generalizable Feature Fields for Mobile Manipulation
Ri-Zhao Qiu
Yafei Hu
Ge Yang
Yuchen Song
Yang Fu
...
Jiteng Mu
Ruihan Yang
Nikolay Atanasov
Sebastian Scherer
Xiaolong Wang
40
27
0
12 Mar 2024
Spatiotemporal Predictive Pre-training for Robotic Motor Control
Jiange Yang
Bei Liu
Jianlong Fu
Bocheng Pan
Gangshan Wu
Limin Wang
42
10
0
08 Mar 2024
Efficient Data Collection for Robotic Manipulation via Compositional Generalization
Jensen Gao
Annie Xie
Ted Xiao
Chelsea Finn
Dorsa Sadigh
29
19
0
08 Mar 2024
Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting
L. Chen
Kush Hari
K. Dharmarajan
Chenfeng Xu
Quan Vuong
Ken Goldberg
49
20
0
29 Feb 2024
MOSAIC: A Modular System for Assistive and Interactive Cooking
Huaxiaoyue Wang
K. Kedia
Juntao Ren
Rahma Abdullah
Atiksh Bhardwaj
...
Maximus Adrian Pace
Yash Sharma
Xiangwan Sun
Neha Sunkara
Sanjiban Choudhury
37
12
0
29 Feb 2024
Talk Through It: End User Directed Manipulation Learning
Carl Winge
Adam Imdieke
Bahaa Aldeeb
Dongyeop Kang
Karthik Desingh
LM&Ro
43
1
0
19 Feb 2024
Real-World Robot Applications of Foundation Models: A Review
Kento Kawaharazuka
T. Matsushima
Andrew Gambardella
Jiaxian Guo
Chris Paxton
Andy Zeng
OffRL
VLM
LM&Ro
48
45
0
08 Feb 2024
MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models
Saumya Saxena
Mohit Sharma
Oliver Kroemer
34
4
0
25 Jan 2024
OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics
Peiqi Liu
Yaswanth Orru
Jay Vakil
Chris Paxton
Nur Muhammad (Mahi) Shafiullah
Lerrel Pinto
LM&Ro
VLM
103
39
0
22 Jan 2024
RePLan: Robotic Replanning with Perception and Language Models
Marta Skreta
Zihan Zhou
Jia Lin Yuan
Kourosh Darvish
Alán Aspuru-Guzik
Animesh Garg
LM&Ro
LRM
40
26
0
08 Jan 2024
Object-Centric Instruction Augmentation for Robotic Manipulation
Junjie Wen
Yichen Zhu
Minjie Zhu
Jinming Li
Zhiyuan Xu
...
Chaomin Shen
Yaxin Peng
Dong Liu
Feifei Feng
Jian Tang
LM&Ro
69
16
0
05 Jan 2024
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Pengxiang Ding
Han Zhao
Wenxuan Song
Zhitao Wang
Zhenyu Wei
Shangke Lyu
Ningxi Yang
Donglin Wang
32
19
0
22 Dec 2023
Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots
Thomas Lampe
A. Abdolmaleki
Sarah Bechtle
Sandy H. Huang
Jost Tobias Springenberg
...
Markus Wulfmeier
Jingwei Zhang
Francesco Nori
N. Heess
Martin Riedmiller
OffRL
32
9
0
18 Dec 2023
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Yafei Hu
Quanting Xie
Vidhi Jain
Jonathan M Francis
Jay Patrikar
...
Xiaolong Wang
Sebastian A. Scherer
Z. Kira
Fei Xia
Yonatan Bisk
LM&Ro
AI4CE
32
63
0
14 Dec 2023
Foundation Models in Robotics: Applications, Challenges, and the Future
Roya Firoozi
Johnathan Tucker
Stephen Tian
Anirudha Majumdar
Jiankai Sun
...
Brian Ichter
Danny Driess
Jiajun Wu
Cewu Lu
Mac Schwager
LM&Ro
AI4CE
LRM
VLM
37
140
0
13 Dec 2023
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models
Ivan Kapelyukh
Yifei Ren
Ignacio Alzugaray
Edward Johns
VLM
LM&Ro
25
20
0
07 Dec 2023
FoMo Rewards: Can we cast foundation models as reward functions?
Ekdeep Singh Lubana
Johann Brehmer
P. D. Haan
Taco S. Cohen
OffRL
LRM
48
2
0
06 Dec 2023
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans
Homanga Bharadhwaj
Abhi Gupta
Vikash Kumar
Shubham Tulsiani
LM&Ro
20
38
0
01 Dec 2023
Transfer Learning in Robotics: An Upcoming Breakthrough? A Review of Promises and Challenges
Noémie Jaquier
Michael C. Welle
A. Gams
Kunpeng Yao
Bernardo Fichera
A. Billard
Aleš Ude
Tamim Asfour
Danica Kragic
30
14
0
29 Nov 2023
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Yilun Kong
Jingqing Ruan
Yihong Chen
Bin Zhang
Tianpeng Bao
...
Xiaoru Hu
Hangyu Mao
Ziyue Li
Xingyu Zeng
Rui Zhao
LLMAG
37
37
0
19 Nov 2023
Learning Generalizable Manipulation Policies with Object-Centric 3D Representations
Yifeng Zhu
Zhenyu Jiang
Peter Stone
Yuke Zhu
3DPC
24
43
0
22 Oct 2023
One-Shot Imitation Learning: A Pose Estimation Perspective
Pietro Vitiello
Kamil Dreczkowski
Edward Johns
26
18
0
18 Oct 2023
Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models
Kevin Black
Mitsuhiko Nakamoto
P. Atreya
Homer Walke
Chelsea Finn
Aviral Kumar
Sergey Levine
DiffM
LM&Ro
32
132
0
16 Oct 2023
LgTS: Dynamic Task Sampling using LLM-generated sub-goals for Reinforcement Learning Agents
Yash Shukla
Wenchang Gao
Vasanth Sarathy
Alvaro Velasquez
Robert Wright
Jivko Sinapov
27
9
0
14 Oct 2023
Open X-Embodiment: Robotic Learning Datasets and RT-X Models
Open X-Embodiment Collaboration
Abby O'Neill
Abdul Rehman
Abhinav Gupta
Abhiram Maddukuri
...
Zhuo Xu
Zichen Jeff Cui
Zichen Zhang
Zipeng Fu
Zipeng Lin
LM&Ro
44
464
0
13 Oct 2023
GROOT: Learning to Follow Instructions by Watching Gameplay Videos
Shaofei Cai
Bowei Zhang
Zihao Wang
Xiaojian Ma
Anji Liu
Yitao Liang
83
26
0
12 Oct 2023
FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation
Xinyu Sun
Peihao Chen
Jugang Fan
Thomas H. Li
Jian Chen
Mingkui Tan
32
12
0
11 Oct 2023
Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models
Chen Jiang
Martin Jägersand
37
1
0
05 Oct 2023
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping
Adam Rashid
Satvik Sharma
C. Kim
J. Kerr
L. Chen
Angjoo Kanazawa
Ken Goldberg
62
85
0
14 Sep 2023
TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage
Jingqing Ruan
Yihong Chen
Bin Zhang
Zhiwei Xu
Tianpeng Bao
...
Shiwei Shi
Hangyu Mao
Ziyue Li
Xingyu Zeng
Rui Zhao
LLMAG
LM&Ro
44
32
0
07 Aug 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng-Tao Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
30
1,100
0
28 Jul 2023
Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation
Bokui (William) Shen
Ge Yang
Alan Yu
J. Wong
L. Kaelbling
Phillip Isola
VLM
29
104
0
27 Jul 2023
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Wenlong Huang
Chen Wang
Ruohan Zhang
Yunzhu Li
Jiajun Wu
Li Fei-Fei
LM&Ro
33
480
0
12 Jul 2023
Empirical Analysis of a Segmentation Foundation Model in Prostate Imaging
Heejong Kim
V. Butoi
Adrian V. Dalca
Daniel J. A. Margolis
M. Sabuncu
OOD
MedIm
19
6
0
06 Jul 2023
DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment
Yanjiang Guo
Yen-Jen Wang
Lihan Zha
Zheyuan Jiang
Jianyu Chen
LM&Ro
24
39
0
01 Jul 2023
KITE: Keypoint-Conditioned Policies for Semantic Manipulation
Priya Sundaresan
Suneel Belkhale
Dorsa Sadigh
Jeannette Bohg
LM&Ro
28
24
0
29 Jun 2023
HomeRobot: Open-Vocabulary Mobile Manipulation
Sriram Yenamandra
A. Ramachandran
Karmesh Yadav
Austin S. Wang
Mukul Khanna
...
Devendra Singh Chaplot
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
44
79
0
20 Jun 2023
Value function estimation using conditional diffusion models for control
Bogdan Mazoure
Walter A. Talbott
Miguel Angel Bautista
R. Devon Hjelm
Alexander Toshev
J. Susskind
DiffM
25
4
0
09 Jun 2023
Transferring Foundation Models for Generalizable Robotic Manipulation
Jiange Yang
Wenhui Tan
Chuhao Jin
Keling Yao
Bei Liu
Jianlong Fu
Ruihua Song
Gangshan Wu
Limin Wang
LM&Ro
47
6
0
09 Jun 2023
GAN-MPC: Training Model Predictive Controllers with Parameterized Cost Functions using Demonstrations from Non-identical Experts
Returaj Burnwal
Anirban Santara
Nirav P. Bhatt
Balaraman Ravindran
Gaurav Aggarwal
19
0
0
30 May 2023
Towards Generalist Robots: A Promising Paradigm via Generative Simulation
Zhou Xian
Théophile Gervet
Zhenjia Xu
Yi-Ling Qiao
Tsun-Hsuan Wang
Yian Wang
LM&Ro
52
7
0
17 May 2023
Grounding Classical Task Planners via Vision-Language Models
Xiaohan Zhang
Yan Ding
S. Amiri
Hao Yang
Andy Kaminski
Chad Esselink
Shiqi Zhang
18
16
0
17 Apr 2023
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
Arjun Majumdar
Karmesh Yadav
Sergio Arnaud
Yecheng Jason Ma
Claire Chen
...
Dhruv Batra
Yixin Lin
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
LM&Ro
19
173
0
31 Mar 2023