Vision-Language Foundation Models as Effective Robot Imitators

2 November 2023
Xinghang Li
Minghuan Liu
Hanbo Zhang
Cunjun Yu
Jie Xu
Hongtao Wu
Chi-Hou Cheang
Ya Jing
Weinan Zhang
Huaping Liu
Hang Li
Tao Kong
    LM&Ro

Papers citing "Vision-Language Foundation Models as Effective Robot Imitators"

49 / 49 papers shown
Dynamic Double Space Tower
Weikai Sun
Shijie Song
Han Wang
19
0
0
13 Jun 2025
RationalVLA: A Rational Vision-Language-Action Model with Dual System
Wenxuan Song
Jiayi Chen
Wenxue Li
Xu He
Han Zhao
...
Xinhu Zheng
Yanfeng Guo
Hesheng Wang
Yunhui Liu
Haoang Li
LM&Ro
169
1
0
12 Jun 2025
GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
Ning Gao
Yilun Chen
Shuai Yang
Xinyi Chen
Yang Tian
Hao Li
Haifeng Huang
Hanqing Wang
Tai Wang
Jiangmiao Pang
LM&Ro
133
0
0
12 Jun 2025
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu
Yuanliang Ju
Shengxiang Sun
Igor Gilitschenski
Haruki Nishimura
Masha Itkina
Florian Shkurti
52
0
0
11 Jun 2025
CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation
Yuxing Long
Jiyao Zhang
Mingjie Pan
Tianshu Wu
Taewhan Kim
Hao Dong
71
0
0
11 Jun 2025
EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Yantai Yang
Yuhao Wang
Zichen Wen
Luo Zhongwei
Chang Zou
Zhipeng Zhang
Chuan Wen
Linfeng Zhang
VLM
78
0
0
11 Jun 2025
SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios
Lingwei Dang
Ruizhi Shao
Hongwen Zhang
Wei Min
Yebin Liu
Qingyao Wu
DiffM
VGen
95
0
0
03 Jun 2025
Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner
Chunhui Zhang
Z. Ouyang
Kwonjoon Lee
Nakul Agarwal
Sean Dae Houlihan
Soroush Vosoughi
Shao-Yuan Lo
LRM
74
0
0
02 Jun 2025
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim
S. Park
Huiwon Jang
Jinwoo Shin
Jaehyung Kim
Younggyo Seo
LRM
39
0
0
29 May 2025
OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning
Fanqi Lin
Ruiqian Nai
Yingdong Hu
Jiacheng You
Junming Zhao
Yang Gao
LRM
101
0
0
17 May 2025
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
Chenxi Jiang
Chuhao Zhou
Jianfei Yang
69
0
0
16 May 2025
ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning
Hongyin Zhang
Zifeng Zhuang
Han Zhao
Pengxiang Ding
Hongchao Lu
Donglin Wang
OffRL
132
0
0
12 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
427
10
0
09 May 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Cheng Chi
Minglan Lin
Yaoxu Lyu
...
Yulong Ao
Yonghua Lin
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
LM&Ro
124
0
0
06 May 2025
Task Reconstruction and Extrapolation for $π_0$ using Text Latent
Quanyi Li
105
0
0
06 May 2025
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li
Lingyun Xu
Hao Fei
Jiaming Liu
Yan Shen
...
Jiahui Xu
Liang Heng
Siyuan Huang
Shanghang Zhang
Hao Dong
LM&Ro
124
0
0
04 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
246
0
0
03 May 2025
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang
Zonghao Ying
Tianyuan Zhang
Siyuan Liang
Shengshan Hu
Mingchuan Zhang
A. Liu
Xianglong Liu
AAML
179
4
0
19 Apr 2025
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Rongyu Zhang
Menghang Dong
Yuan Zhang
Liang Heng
Xiaowei Chi
Gaole Dai
Li Du
Dan Wang
Yuan Du
MoE
169
4
0
26 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
158
68
0
18 Mar 2025
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks
Yi Zhang
Qiang Zhang
Xiaozhu Ju
Ziqiang Liu
Jilei Mao
...
Jiaxu Wang
Yiqun Duan
Jiahang Cao
Renjing Xu
Jian Tang
LM&Ro
LRM
112
0
0
14 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
196
20
0
13 Mar 2025
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li
Tielong Cai
Tianci Tang
Wenhao Chai
Katherine Rose Driggs-Campbell
Gaoang Wang
LM&Ro
240
0
0
11 Mar 2025
VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation
Wei Zhao
Pengxiang Ding
Hao Fei
Zhefei Gong
Shuanghao Bai
Han Zhao
Donglin Wang
150
11
0
24 Feb 2025
X-IL: Exploring the Design Space of Imitation Learning Policies
Xiaogang Jia
Atalay Donat
Xi Huang
Xuan Zhao
Denis Blessing
...
Han A. Wang
Hanyi Zhang
Qian Wang
Rudolf Lioutikov
Gerhard Neumann
153
1
0
20 Feb 2025
Towards Fusing Point Cloud and Visual Representations for Imitation Learning
Atalay Donat
Xiaogang Jia
Xi Huang
Aleksandar Taranovic
Denis Blessing
Ge Li
Hongyi Zhou
Hanyi Zhang
Rudolf Lioutikov
Gerhard Neumann
3DPC
SSL
149
1
0
20 Feb 2025
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning
Guoqin Tang
Qingxuan Jia
Zeyuan Huang
Gang Chen
Ning Ji
Zhipeng Yao
114
0
0
13 Feb 2025
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Mingjie Pan
Jiyao Zhang
Tianshu Wu
Yinghao Zhao
Wenlong Gao
Hao Dong
LM&Ro
119
13
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
177
15
0
06 Jan 2025
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
Kun Wu
Chengkai Hou
Jiaming Liu
Zhengping Che
Xiaozhu Ju
...
Zhenyu Wang
Pengju An
Siyuan Qian
Shanghang Zhang
Jian Tang
LM&Ro
242
24
0
18 Dec 2024
RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
Weixin Mao
Weiheng Zhong
Zhou Jiang
Dong Fang
Zhongyue Zhang
...
Fan Jia
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
235
7
0
29 Nov 2024
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation via Safety-as-Policy
Minheng Ni
Lei Zhang
Zhaoyu Chen
Lefei Zhang
Wangmeng Zuo
Jianwei Zhang
Lei Zhang
W. Zuo
128
1
0
27 Nov 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
204
3
0
21 Nov 2024
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Taowen Wang
Dongfang Liu
James Liang
Wenhao Yang
Qifan Wang
Cheng Han
Jiebo Luo
Ruixiang Tang
AAML
184
6
0
18 Nov 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
159
11
0
10 Oct 2024
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang
Zhehua Zhou
Jiayang Song
Yuheng Huang
Zhan Shu
Lei Ma
90
1
0
07 Oct 2024
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Yangtao Chen
Zixuan Chen
Junhui Yin
Jing Huo
Pinzhuo Tian
Jieqi Shi
Yang Gao
LM&Ro
148
3
0
30 Sep 2024
HiRT: Enhancing Robotic Control with Hierarchical Robot Transformers
Jianke Zhang
Yanjiang Guo
Xiaoyu Chen
Yen-Jen Wang
Yucheng Hu
Chengming Shi
Jianyu Chen
94
13
0
12 Sep 2024
RoboCAS: A Benchmark for Robotic Manipulation in Complex Object Arrangement Scenarios
Liming Zheng
Feng Yan
Fanfan Liu
Chengjian Feng
Zhuoliang Kang
Lin Ma
140
2
0
09 Jul 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
151
7
0
20 Jun 2024
Grounding Multimodal Large Language Models in Actions
Andrew Szot
Bogdan Mazoure
Harsh Agrawal
Devon Hjelm
Z. Kira
Alexander Toshev
LM&Ro
88
14
0
12 Jun 2024
Learning Manipulation by Predicting Interaction
Jia Zeng
Qingwen Bu
Bangjun Wang
Wenke Xia
Li Chen
...
Heming Cui
Bin Zhao
Xuelong Li
Yu Qiao
Hongyang Li
134
26
0
01 Jun 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
335
54
0
23 May 2024
Learning Manipulation Skills through Robot Chain-of-Thought with Sparse Failure Guidance
Kaifeng Zhang
Zhao-Heng Yin
Weirui Ye
Yang Gao
159
4
0
22 May 2024
From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control
Yide Shentu
Philipp Wu
Aravind Rajeswaran
Pieter Abbeel
98
15
0
08 May 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Yong A
Hongze Yu
...
Huaping Liu
Gang Hua
F. Sun
Jianwei Zhang
Bin Fang
AI4CE
LM&Ro
222
15
0
28 Apr 2024
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni
Rui Ye
Yuxian Wei
Zhen Xiang
Yanfeng Wang
Siheng Chen
AAML
98
13
0
19 Apr 2024
GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot
Wenxuan Song
Han Zhao
Pengxiang Ding
Can Cui
Shangke Lyu
Yaning Fan
Donglin Wang
OffRL
120
14
0
20 Mar 2024
Human Demonstrations are Generalizable Knowledge for Robots
Te Cui
Guangyan Chen
Tianxing Zhou
Zicai Peng
Mengxiao Hu
Haoyang Lu
Haizhou Li
Meiling Wang
Yi Yang
Yufeng Yue
LM&Ro
94
6
0
05 Dec 2023