2503.03464
Generative Artificial Intelligence in Robotic Manipulation: A Survey
5 March 2025
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
Hangjie Yuan
Chao Zhao
Tao Feng
M. Y. Wang
Qifeng Chen
Jia Pan
Wei Zhang
Bo Yang
Hua Chen
Papers citing
"Generative Artificial Intelligence in Robotic Manipulation: A Survey"
50 / 214 papers shown
Title
Surfer: Progressive Reasoning with World Models for Robotic Manipulation
Pengzhen Ren
Kaiwen Zhang
Hetao Zheng
Zixuan Li
Yuhang Wen
Fengda Zhu
Mas Ma
Xiaodan Liang
LM&Ro
LRM
32
4
0
20 Jun 2023
NBMOD: Find It and Grasp It in Noisy Background
Boyuan Cao
Xinyu Zhou
Congmin Guo
Baohua Zhang
Yuchen Liu
Qianqiu Tan
92
4
0
17 Jun 2023
Language to Rewards for Robotic Skill Synthesis
Wenhao Yu
Nimrod Gileadi
Chuyuan Fu
Sean Kirmani
Kuang-Huei Lee
...
N. Heess
Dorsa Sadigh
Jie Tan
Yuval Tassa
F. Xia
LM&Ro
83
279
0
14 Jun 2023
LIV: Language-Image Representations and Rewards for Robotic Control
Yecheng Jason Ma
William Liang
Vaidehi Som
Vikash Kumar
Amy Zhang
Osbert Bastani
Dinesh Jayaraman
LM&Ro
85
128
0
01 Jun 2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Xi Chen
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Soravit Changpinyo
...
Mojtaba Seyedhosseini
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
VLM
128
202
0
29 May 2023
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
Yao Mu
Qinglong Zhang
Mengkang Hu
Wen Wang
Mingyu Ding
Jun Jin
Bin Wang
Jifeng Dai
Yu Qiao
Ping Luo
LM&Ro
LRM
83
242
0
24 May 2023
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Zirui Zhao
W. Lee
David Hsu
LRM
LLMAG
LM&Ro
77
223
0
23 May 2023
PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction
Hao Wu
Wei Xion
Fan Xu
Xian-Sheng Hua
C. L. Philip Chen
Xiansheng Hua
AI4TS
183
28
0
19 May 2023
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
Siyuan Huang
Zhengkai Jiang
Hao Dong
Yu Qiao
Peng Gao
Hongsheng Li
LM&Ro
76
94
0
18 May 2023
Going Denser with Open-Vocabulary Part Segmentation
Pei Sun
Shoufa Chen
Chenchen Zhu
Fanyi Xiao
Ping Luo
Saining Xie
Zhicheng Yan
ObjD
VLM
66
48
0
18 May 2023
EC^2: Emergent Communication for Embodied Control
Yao Mu
Shunyu Yao
Mingyu Ding
Ping Luo
Chuang Gan
LM&Ro
55
20
0
19 Apr 2023
DINOv2: Learning Robust Visual Features without Supervision
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
...
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
357
3,479
0
14 Apr 2023
Segment Anything
A. Kirillov
Eric Mintun
Nikhila Ravi
Hanzi Mao
Chloe Rolland
...
Spencer Whitehead
Alexander C. Berg
Wan-Yen Lo
Piotr Dollár
Ross B. Girshick
MLLM
VLM
336
7,365
0
05 Apr 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
226
1,150
0
27 Mar 2023
Text2Motion: From Natural Language Instructions to Feasible Plans
Kevin Lin
Christopher Agia
Toki Migimatsu
Marco Pavone
Jeannette Bohg
LM&Ro
69
280
0
21 Mar 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,631
0
15 Mar 2023
Chat with the Environment: Interactive Multimodal Perception Using Large Language Models
Xufeng Zhao
Mengdi Li
C. Weber
Muhammad Burhan Hafez
S. Wermter
LLMAG
LM&Ro
LRM
144
49
0
14 Mar 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
119
86
0
13 Mar 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
347
1,189
0
07 Mar 2023
PaLM-E: An Embodied Multimodal Language Model
Danny Driess
F. Xia
Mehdi S. M. Sajjadi
Corey Lynch
Aakanksha Chowdhery
...
Marc Toussaint
Klaus Greff
Andy Zeng
Igor Mordatch
Peter R. Florence
LM&Ro
109
1,663
0
06 Mar 2023
UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy
Yinzhen Xu
Weikang Wan
Jialiang Zhang
Haoran Liu
Zikang Shan
...
Yijia Weng
Jiayi Chen
Tengyu Liu
Li Yi
He Wang
146
119
0
02 Mar 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents
Wenlong Huang
Fei Xia
Dhruv Shah
Danny Driess
Andy Zeng
...
Pete Florence
Igor Mordatch
Sergey Levine
Karol Hausman
Brian Ichter
LM&Ro
73
49
0
01 Mar 2023
Scaling Robot Learning with Semantically Imagined Experience
Tianhe Yu
Ted Xiao
Austin Stone
Jonathan Tompson
Anthony Brohan
...
M. Dee
Jodilyn Peralta
Brian Ichter
Karol Hausman
F. Xia
LM&Ro
DiffM
66
155
0
22 Feb 2023
GenAug: Retargeting behaviors to unseen situations via Generative Augmentation
Zoey Chen
Sho Kiami
Abhishek Gupta
Vikash Kumar
LM&Ro
67
86
0
13 Feb 2023
Learning Universal Policies via Text-Guided Video Generation
Yilun Du
Mengjiao Yang
Bo Dai
H. Dai
Ofir Nachum
J. Tenenbaum
Dale Schuurmans
Pieter Abbeel
PINN
LM&Ro
102
258
0
31 Jan 2023
Distilling Internet-Scale Vision-Language Models into Embodied Agents
T. Sumers
Kenneth Marino
Arun Ahuja
Rob Fergus
Ishita Dasgupta
LM&Ro
61
25
0
29 Jan 2023
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Mahmoud Assran
Quentin Duval
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Yann LeCun
Nicolas Ballas
SSL
AI4TS
MDE
78
360
0
19 Jan 2023
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
70
601
0
10 Jan 2023
Policy Adaptation from Foundation Model Feedback
Yuying Ge
Annabella Macaluso
Erran L. Li
Ping Luo
Xiaolong Wang
LM&Ro
51
13
0
14 Dec 2022
RT-1: Robotics Transformer for Real-World Control at Scale
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Joseph Dabis
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
113
1,150
0
13 Dec 2022
CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning
Zhao Mandi
Homanga Bharadhwaj
Vincent Moens
Shuran Song
Aravind Rajeswaran
Vikash Kumar
LM&Ro
88
76
0
12 Dec 2022
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
Ted Xiao
Harris Chan
P. Sermanet
Ayzaan Wahid
Anthony Brohan
Karol Hausman
Sergey Levine
Jonathan Tompson
VLM
LM&Ro
77
68
0
21 Nov 2022
InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks
Aleksander Holynski
Alexei A. Efros
DiffM
207
1,813
0
17 Nov 2022
TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation
Chuer Pan
Brian Okorn
Harry Zhang
Ben Eisner
David Held
80
59
0
17 Nov 2022
Learning Reward Functions for Robotic Manipulation by Observing Humans
Minttu Alakuijala
Gabriel Dulac-Arnold
Julien Mairal
Jean Ponce
Cordelia Schmid
OffRL
58
27
0
16 Nov 2022
Leveraging Fully Observable Policies for Learning under Partial Observability
Hai V. Nguyen
Andrea Baisero
Dian Wang
Chris Amato
Robert Platt
OffRL
77
20
0
03 Nov 2022
Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation
Zoey Qiuyu Chen
Karl Van Wyk
Yu-Wei Chao
Wei Yang
Arsalan Mousavian
Abhishek Gupta
Dieter Fox
59
29
0
24 Oct 2022
STAP: Sequencing Task-Agnostic Policies
Christopher Agia
Toki Migimatsu
Jiajun Wu
Jeannette Bohg
80
20
0
21 Oct 2022
ExAug: Robot-Conditioned Navigation Policies via Geometric Experience Augmentation
Noriaki Hirose
Dhruv Shah
A. Sridhar
Sergey Levine
51
29
0
14 Oct 2022
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Albert Yu
Raymond J. Mooney
LM&Ro
53
20
0
10 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
101
355
0
06 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
190
148
0
05 Oct 2022
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Yecheng Jason Ma
Shagun Sodhani
Dinesh Jayaraman
Osbert Bastani
Vikash Kumar
Amy Zhang
SSL
OffRL
77
304
0
30 Sep 2022
An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots
Dario Mantegazza
Alessandro Giusti
L. Gambardella
Jérôme Guzzi
47
13
0
20 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
Dieter Fox
LM&Ro
255
497
0
12 Sep 2022
SE(3)-DiffusionFields: Learning smooth cost functions for joint grasp and motion optimization through diffusion
Julen Urain
Niklas Funk
Jan Peters
Georgia Chalvatzaki
DiffM
127
127
0
08 Sep 2022
Human-to-Robot Imitation in the Wild
Shikhar Bahl
Abhi Gupta
Deepak Pathak
97
170
0
19 Jul 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG
LM&Ro
LRM
131
916
0
12 Jul 2022
Deep Learning Approaches to Grasp Synthesis: A Review
Rhys Newbury
Morris Gu
Lachlan Chumbley
Arsalan Mousavian
Clemens Eppner
...
A. Morales
Tamim Asfour
Danica Kragic
Dieter Fox
Akansel Cosgun
82
170
0
06 Jul 2022
Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization
A. Wu
Michelle Guo
Chenxi Liu
104
30
0
01 Jul 2022