2311.01378
Cited By
Vision-Language Foundation Models as Effective Robot Imitators
2 November 2023
Xinghang Li
Minghuan Liu
Hanbo Zhang
Cunjun Yu
Jie Xu
Hongtao Wu
Chi-Hou Cheang
Ya Jing
Weinan Zhang
Huaping Liu
Hang Li
Tao Kong
LM&Ro
Papers citing
"Vision-Language Foundation Models as Effective Robot Imitators"
50 / 111 papers shown
Title
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Chenxi Jiang
Chuhao Zhou
Jianfei Yang
9
0
0
16 May 2025
Unveiling the Potential of Vision-Language-Action Models with Open-Ended Multimodal Instructions
Wei Zhao
Gongsheng Li
Zhefei Gong
Pengxiang Ding
Han Zhao
Donglin Wang
LM&Ro
22
0
0
16 May 2025
DataMIL: Selecting Data for Robot Imitation Learning with Datamodels
Shivin Dass
Alaa Khaddaj
Logan Engstrom
Aleksander Madry
Andrew Ilyas
Roberto Martín-Martín
26
0
0
14 May 2025
VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation
Chaofan Zhang
Peng Hao
Xiaoge Cao
Xiaoshuai Hao
Shaowei Cui
Shuo Wang
32
0
0
14 May 2025
ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning
Hongyin Zhang
Zifeng Zhuang
Han Zhao
Pengxiang Ding
Hongchao Lu
Donglin Wang
OffRL
44
0
0
12 May 2025
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
V. Bhat
Yu-Hsiang Lan
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
52
0
0
09 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
134
0
0
09 May 2025
OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation
Can Cui
Pengxiang Ding
Wenxuan Song
Shuanghao Bai
Xinyang Tong
...
Yang Liu
Bofang Jia
Han Zhao
Siteng Huang
Donglin Wang
26
1
0
06 May 2025
Task Reconstruction and Extrapolation for $\pi_0$ using Text Latent
Quanyi Li
42
0
0
06 May 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
Shanghang Zhang
LM&Ro
48
0
0
06 May 2025
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li
Lingyun Xu
Hao Fei
Jiaming Liu
Yan Shen
...
Jiahui Xu
Liang Heng
Siyuan Huang
Shanghang Zhang
Hao Dong
LM&Ro
51
0
0
04 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
58
0
0
03 May 2025
Dynamic Robot Tool Use with Vision Language Models
Noah Trupin
Zixing Wang
A. H. Qureshi
37
0
0
02 May 2025
Few-Shot Vision-Language Action-Incremental Policy Learning
Mingchen Song
Xiang Deng
Guoqiang Zhong
Qi Lv
Jia Wan
Yinchuan Li
Haifeng Zhang
Weili Guan
41
0
0
22 Apr 2025
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang
Zonghao Ying
Tianyuan Zhang
Siyuan Liang
Shengshan Hu
Mingchuan Zhang
A. Liu
Xianglong Liu
AAML
33
1
0
19 Apr 2025
RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics
Zhiyuan Zhang
Yuxin He
Yong Sun
Junyu Shi
Lijiang Liu
Qiang Nie
VLM
49
0
0
02 Apr 2025
Robust Offline Imitation Learning Through State-level Trajectory Stitching
Shuze Wang
Yunpeng Mei
Hongjie Cao
Yetian Yuan
Gang Wang
Jian Sun
Jie Chen
OffRL
39
0
0
28 Mar 2025
REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation
Puzhen Yuan
Angyuan Ma
Yunchao Yao
Huaxiu Yao
Masayoshi Tomizuka
Mingyu Ding
LM&Ro
64
1
0
28 Mar 2025
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Rongyu Zhang
Menghang Dong
Yuan Zhang
Liang Heng
Xiaowei Chi
Gaole Dai
Li Du
Dan Wang
Yuan Du
MoE
84
0
0
26 Mar 2025
DataPlatter: Boosting Robotic Manipulation Generalization with Minimal Costly Data
Liming Zheng
Feng Yan
Fanfan Liu
C. Feng
Yufeng Zhong
Yiyang Huang
Lin Ma
47
0
0
25 Mar 2025
RoboFlamingo-Plus: Fusion of Depth and RGB Perception with Vision-Language Models for Enhanced Robotic Manipulation
Sheng Wang
VLM
76
2
0
25 Mar 2025
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
Zhi Hou
Tianyi Zhang
Yuwen Xiong
Haonan Duan
Hengjun Pu
...
Chengyang Zhao
X. Zhu
Yu Qiao
Jifeng Dai
Yuxiao Chen
59
1
0
25 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
68
25
0
18 Mar 2025
MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation
Zhenyu Wu
Yuheng Zhou
Xiuwei Xu
Zehua Wang
Haibin Yan
49
2
0
17 Mar 2025
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks
Yi Zhang
Qiang Zhang
Xiaozhu Ju
Ziqiang Liu
Jilei Mao
...
Jiaxu Wang
Yiqun Duan
Jiahang Cao
Renjing Xu
Jian Tang
LM&Ro
LRM
62
0
0
14 Mar 2025
Towards Fast, Memory-based and Data-Efficient Vision-Language Policy
Haoxuan Li
Sixu Yan
Yong Li
Xinggang Wang
LM&Ro
64
0
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
72
8
0
13 Mar 2025
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li
Tielong Cai
Tianci Tang
Wenhao Chai
Katherine Rose Driggs-Campbell
Gaoang Wang
LM&Ro
61
0
0
11 Mar 2025
MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models
Han Zhao
Wenxuan Song
Donglin Wang
Xinyang Tong
Pengxiang Ding
Xuelian Cheng
Zongyuan Ge
55
2
0
11 Mar 2025
TLA: Tactile-Language-Action Model for Contact-Rich Manipulation
Peng Hao
Chaofan Zhang
Dingzhe Li
Xiaoge Cao
Xiaoshuai Hao
Shaowei Cui
Shuo Wang
LM&Ro
52
7
0
11 Mar 2025
Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding
Wenxuan Song
Jiayi Chen
Pengxiang Ding
Han Zhao
Wei Zhao
Zhide Zhong
Zongyuan Ge
Jun Ma
Haoang Li
54
3
0
04 Mar 2025
VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation
Wei Zhao
Pengxiang Ding
Hao Fei
Zhefei Gong
Shuanghao Bai
Han Zhao
Donglin Wang
93
6
0
24 Feb 2025
X-IL: Exploring the Design Space of Imitation Learning Policies
Xiaogang Jia
Atalay Donat
Xi Huang
Xuan Zhao
Denis Blessing
...
Han A. Wang
Hanyi Zhang
Qian Wang
Rudolf Lioutikov
Gerhard Neumann
91
1
0
20 Feb 2025
Towards Fusing Point Cloud and Visual Representations for Imitation Learning
Atalay Donat
Xiaogang Jia
Xi Huang
Aleksandar Taranovic
Denis Blessing
Ge Li
Hongyi Zhou
Hanyi Zhang
Rudolf Lioutikov
Gerhard Neumann
3DPC
SSL
73
1
0
20 Feb 2025
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi
Wenyao Zhang
Yufei Ding
Runpei Dong
Xinqiang Yu
...
Xin Jin
Kaisheng Ma
Zhizheng Zhang
He Wang
Li Yi
LM&Ro
131
4
0
18 Feb 2025
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
Kun Wu
Chengkai Hou
Jiaming Liu
Zhengping Che
Xiaozhu Ju
...
Zhenyu Wang
Pengju An
Siyuan Qian
Shanghang Zhang
Jian Tang
LM&Ro
113
15
0
17 Feb 2025
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning
Guoqin Tang
Qingxuan Jia
Zeyuan Huang
Gang Chen
Ning Ji
Zhipeng Yao
66
0
0
13 Feb 2025
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Mingjie Pan
Jiyao Zhang
Tianshu Wu
Yinghao Zhao
Wenlong Gao
Hao Dong
LM&Ro
55
6
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
88
12
0
06 Jan 2025
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation
Yang Tian
Sizhe Yang
Jia Zeng
P. Wang
Dahua Lin
Hao Dong
Jiangmiao Pang
84
17
0
19 Dec 2024
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models
Xinghang Li
Peiyan Li
Minghuan Liu
Dong Wang
Jirong Liu
Bingyi Kang
Xiao Ma
Tao Kong
Hanbo Zhang
Huaping Liu
LM&Ro
97
18
0
18 Dec 2024
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
Moritz Reuss
Jyothish Pari
Pulkit Agrawal
Rudolf Lioutikov
DiffM
MoE
84
6
0
17 Dec 2024
RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation
Feng Yan
Fanfan Liu
Liming Zheng
Yufeng Zhong
Yiyang Huang
Zechao Guan
Chengjian Feng
Lin Ma
84
2
0
10 Dec 2024
DaDu-E: Rethinking the Role of Large Language Model in Robotic Computing Pipeline
Wenhao Sun
Sai Hou
Zehao Wang
Bo Yu
Shaoshan Liu
Xu Yang
Shuai Liang
Yiming Gan
Yinhe Han
LLMAG
121
2
0
02 Dec 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
83
23
0
29 Nov 2024
RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
Weixin Mao
Weiheng Zhong
Zhou Jiang
Dong Fang
Zhongyue Zhang
...
Fan Jia
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
119
5
0
29 Nov 2024
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Yueru Jia
Jiaming Liu
Sixiang Chen
Chenyang Gu
Zihan Wang
...
Lily Lee
Pengwei Wang
Zhongyuan Wang
Renrui Zhang
Shanghang Zhang
89
11
0
27 Nov 2024
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation
Minheng Ni
Lei Zhang
Zhaoyu Chen
Lefei Zhang
Wangmeng Zuo
74
1
0
27 Nov 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
103
2
0
21 Nov 2024
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Taowen Wang
Dongfang Liu
James Liang
Wenhao Yang
Qifan Wang
Cheng Han
Jiebo Luo
Ruixiang Tang
AAML
79
3
0
18 Nov 2024