2505.05540
Cited By
v1
v2 (latest)
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
8 May 2025
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
Paul Pu Liang
LM&Ro
VLM
Papers citing
"Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments"
13 / 13 papers shown
Title
An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Jaewoo Song
Harshvardhan Sikka
50
0
0
10 Jun 2025
Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions
Cunxin Fan
Xiaosong Jia
Yihang Sun
Yixiao Wang
Jianglan Wei
...
Xiangyu Zhao
Masayoshi Tomizuka
Xue Yang
Junchi Yan
Mingyu Ding
LM&Ro
VLM
112
10
0
04 May 2025
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li
Jinkun Cao
Haotian Zhang
Davis Rempe
Jan Kautz
Umar Iqbal
Ye Yuan
DiffM
VGen
100
1
0
02 May 2025
Latent Diffusion Planning for Imitation Learning
Amber Xie
Oleh Rybkin
Dorsa Sadigh
Chelsea Finn
113
1
0
23 Apr 2025
π_{0.5}: a Vision-Language-Action Model with Open-World Generalization
Physical Intelligence
Kevin Black
Noah Brown
James Darpinian
Karan Dhabalia
...
Homer Walke
Anna Walling
Haohuan Wang
Lili Yu
Ury Zhilinsky
LM&Ro
VLM
142
51
0
22 Apr 2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao
Yao Lu
Moo Jin Kim
Zipeng Fu
Zhuoyang Zhang
...
Ankur Handa
Xuan Li
Donglai Xiang
Gordon Wetzstein
Nayeon Lee
LM&Ro
LRM
99
33
0
27 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
164
68
0
18 Mar 2025
Magma: A Foundation Model for Multimodal AI Agents
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
175
18
0
18 Feb 2025
A Comprehensive Survey of Agents for Computer Use: Foundations, Challenges, and Future Directions
Pascal Sager
Benjamin Meyer
Peng Yan
Rebekka von Wartburg-Kottler
Layan Etaiwi
Aref Enayati
Gabriel Nobel
Ahmed Abdulkadir
Benjamin Grewe
Thilo Stadelmann
LLMAG
67
8
0
27 Jan 2025
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Karl Pertsch
Kyle Stachowicz
Brian Ichter
Danny Driess
Suraj Nair
Q. Vuong
Oier Mees
Chelsea Finn
Sergey Levine
161
70
0
17 Jan 2025
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F. Xu
Yufan Song
Boxuan Li
Yuxuan Tang
Kritanjali Jain
...
Wayne Chi
Lawrence Jang
Yiqing Xie
Shuyan Zhou
Graham Neubig
LLMAG
204
42
0
18 Dec 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
171
45
0
15 Oct 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
337
54
0
23 May 2024