Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.20095
Cited By
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
28 June 2024
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
Jinghuan Shang
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"LLaRA: Supercharging Robot Learning Data for Vision-Language Policy"
32 / 32 papers shown
Title
GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation
Abhay Deshpande
Yuquan Deng
Arijit Ray
Jordi Salvador
Winson Han
Jiafei Duan
Kuo-Hao Zeng
Yuke Zhu
Ranjay Krishna
Rose Hendrix
37
0
0
19 May 2025
From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation
Yifu Yuan
Haiqin Cui
Yibin Chen
Zibin Dong
Fei Ni
Longxin Kou
Jinyi Liu
Pengyi Li
Yan Zheng
Jianye Hao
50
0
0
13 May 2025
Pixel Motion as Universal Representation for Robot Control
Kanchana Ranasinghe
Xiang Li
Cristina Mata
J. Park
Michael S. Ryoo
VGen
44
0
0
12 May 2025
π
0.5
π_{0.5}
π
0.5
: a Vision-Language-Action Model with Open-World Generalization
Physical Intelligence
Kevin Black
Noah Brown
James Darpinian
Karan Dhabalia
...
Homer Walke
Anna Walling
Haohuan Wang
Lili Yu
Ury Zhilinsky
LM&Ro
VLM
72
25
0
22 Apr 2025
Towards Fast, Memory-based and Data-Efficient Vision-Language Policy
Haoxuan Li
Sixu Yan
Yongqian Li
Xinggang Wang
LM&Ro
91
0
0
13 Mar 2025
RoboDesign1M: A Large-scale Dataset for Robot Design Understanding
T. H. Le
T. H. Nguyen
Quang-Dieu Tran
Quang Minh Nguyen
Baoru Huang
Hoan Nguyen
M. Vu
Tung D. Ta
A. Nguyen
3DV
100
0
0
09 Mar 2025
Teaching Metric Distance to Autoregressive Multimodal Foundational Models
Jiwan Chung
Saejin Kim
Yongrae Jo
Jinho Park
Dongjun Min
Youngjae Yu
120
0
0
04 Mar 2025
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
LRM
54
10
0
24 Feb 2025
Pre-training Auto-regressive Robotic Models with 4D Representations
Dantong Niu
Yuvan Sharma
Haoru Xue
Giscard Biamby
Junyi Zhang
Ziteng Ji
Trevor Darrell
Roei Herzig
95
1
0
18 Feb 2025
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
131
11
0
18 Feb 2025
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Taowen Wang
Dongfang Liu
James Liang
Wenhao Yang
Qifan Wang
Cheng Han
Jiebo Luo
Ruixiang Tang
Ruixiang Tang
AAML
111
6
0
18 Nov 2024
A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM
ByungOk Han
Jaehong Kim
Jinhyeok Jang
61
2
0
21 Oct 2024
In-Context Learning Enables Robot Action Prediction in LLMs
Yida Yin
Zekai Wang
Yuvan Sharma
Dantong Niu
Trevor Darrell
Roei Herzig
LM&Ro
145
4
0
16 Oct 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
51
34
0
15 Oct 2024
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang
Zhehua Zhou
Jiayang Song
Yuheng Huang
Zhan Shu
Lei Ma
45
0
0
07 Oct 2024
AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation
Jiafei Duan
Wilbert Pumacay
Nishanth Kumar
Yi Ru Wang
Shulin Tian
Wentao Yuan
Ranjay Krishna
Dieter Fox
Ajay Mandlekar
Yijie Guo
VLM
LRM
82
23
0
01 Oct 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
91
6
0
27 Sep 2024
CLSP: High-Fidelity Contrastive Language-State Pre-training for Agent State Representation
Fuxian Huang
Qi Zhang
Shaopeng Zhai
Jie Wang
Tianyi Zhang
Haoran Zhang
Ming Zhou
Yu Liu
Yu Qiao
CLIP
AI4TS
43
0
0
24 Sep 2024
Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models
Hao Cheng
Erjia Xiao
Chengyuan Yu
Zhao Yao
Jiahang Cao
...
Jiaxu Wang
Mengshu Sun
Kaidi Xu
Jindong Gu
Renjing Xu
AAML
48
3
0
20 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Yinlin Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Yaxin Peng
Chaomin Shen
Feifei Feng
Jian Tang
LM&Ro
86
57
0
19 Sep 2024
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Jinghuan Shang
Karl Schmeckpeper
Brandon B. May
M. Minniti
Tarik Kelestemur
David Watkins
Laura Herlant
VLM
48
23
0
29 Jul 2024
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
Kartik Kuckreja
M. S. Danish
Muzammal Naseer
Abhijit Das
Salman Khan
Fahad Shahbaz Khan
51
145
0
24 Nov 2023
Ferret: Refer and Ground Anything Anywhere at Any Granularity
Haoxuan You
Haotian Zhang
Zhe Gan
Xianzhi Du
Bowen Zhang
Zirui Wang
Liangliang Cao
Shih-Fu Chang
Yinfei Yang
ObjD
MLLM
VLM
49
314
0
11 Oct 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
VLM
MLLM
109
231
0
07 Jul 2023
Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai
Basil Mustafa
Alexander Kolesnikov
Lucas Beyer
CLIP
VLM
54
1,028
0
27 Mar 2023
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
303
13,788
0
15 Mar 2023
Faithful Reasoning Using Large Language Models
Antonia Creswell
Murray Shanahan
ReLM
LRM
43
123
0
30 Aug 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
135
798
0
12 May 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn
Anthony Brohan
Noah Brown
Yevgen Chebotar
Omar Cortes
...
Ted Xiao
Peng Xu
Sichun Xu
Mengyuan Yan
Andy Zeng
LM&Ro
93
1,901
0
04 Apr 2022
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
...
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
100
577
0
01 Apr 2022
Online Decision Transformer
Qinqing Zheng
Amy Zhang
Aditya Grover
OffRL
36
205
0
11 Feb 2022
Predicting Video with VQVAE
Jacob Walker
Ali Razavi
Aaron van den Oord
DRL
46
67
0
02 Mar 2021
1