Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.07511
Cited By
PointVLA: Injecting the 3D World into Vision-Language-Action Models
10 March 2025
Chengmeng Li
Junjie Wen
Yan Peng
Chaomin Shen
Feifei Feng
Yinlin Zhu
3DPC
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"PointVLA: Injecting the 3D World into Vision-Language-Action Models"
24 / 24 papers shown
Title
ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge
Zhongyi Zhou
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
LM&Ro
LRM
VLM
77
0
0
28 May 2025
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization
Jiaming Zhou
Ke Ye
Jiayi Liu
Teli Ma
Zifang Wang
Ronghe Qiu
Kun-Yu Lin
Zhilin Zhao
Junwei Liang
81
2
0
21 May 2025
EmbodiedMAE: A Unified 3D Multi-Modal Representation for Robot Manipulation
Zibin Dong
Fei Ni
Yifu Yuan
Yinchuan Li
Jianye Hao
107
0
0
15 May 2025
CLTP: Contrastive Language-Tactile Pre-training for 3D Contact Geometry Understanding
Wenxuan Ma
Xiaoge Cao
Yize Zhang
Chaofan Zhang
Shaobo Yang
Peng Hao
Bin Fang
Yinghao Cai
Shaowei Cui
Shuo Wang
96
0
0
13 May 2025
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges
Ranjan Sapkota
Yang Cao
Konstantinos I. Roumeliotis
Manoj Karkee
LM&Ro
383
2
0
07 May 2025
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
Taiyi Wang
Zhihao Wu
Jianheng Liu
Jianye Hao
Jun Wang
Kun Shao
OffRL
96
29
0
24 Feb 2025
VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation
Wei Zhao
Pengxiang Ding
Hao Fei
Zhefei Gong
Shuanghao Bai
Han Zhao
Donglin Wang
139
11
0
24 Feb 2025
Qwen2.5-VL Technical Report
S. Bai
Keqin Chen
Xuejing Liu
Jialin Wang
Wenbin Ge
...
Zesen Cheng
Hang Zhang
Zhibo Yang
Haiyang Xu
Junyang Lin
VLM
322
685
0
20 Feb 2025
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
Junjie Wen
Yinlin Zhu
Jinming Li
Zhibin Tang
Yaxin Peng
Feifei Feng
VLM
108
27
0
09 Feb 2025
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Karl Pertsch
Kyle Stachowicz
Brian Ichter
Danny Driess
Suraj Nair
Q. Vuong
Oier Mees
Chelsea Finn
Sergey Levine
132
70
0
17 Jan 2025
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
148
7
0
27 Sep 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
Chenming Zhu
Tai Wang
Wenwei Zhang
Jiangmiao Pang
Xihui Liu
199
53
0
26 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Yinlin Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Yaxin Peng
Chaomin Shen
Feifei Feng
Jian Tang
LM&Ro
116
68
0
19 Sep 2024
Robotic Control via Embodied Chain-of-Thought Reasoning
Michał Zawalski
William Chen
Karl Pertsch
Oier Mees
Chelsea Finn
Sergey Levine
LRM
LM&Ro
123
88
0
11 Jul 2024
DCPI-Depth: Explicitly Infusing Dense Correspondence Prior to Unsupervised Monocular Depth Estimation
Mengtan Zhang
Yi Feng
Qijun Chen
Rui Fan
MDE
130
6
0
27 May 2024
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Han Zhao
Min Zhang
Wei Zhao
Pengxiang Ding
Siteng Huang
Donglin Wang
Mamba
101
74
0
21 Mar 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
...
Thomas Kollar
Sergey Levine
Chelsea Finn
Sergey Levine
Chelsea Finn
223
221
0
19 Mar 2024
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields
Yanjie Ze
Ge Yan
Yueh-hua Wu
Annabella Macaluso
Yuying Ge
Jianglong Ye
Nicklas Hansen
Li Erran Li
Xinyu Wang
DiffM
AI4CE
77
83
0
31 Aug 2023
RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot
Haoshu Fang
Hongjie Fang
Zhenyu Tang
Jirong Liu
Chenxi Wang
Junbo Wang
Haoyi Zhu
Cewu Lu
105
73
0
02 Jul 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
560
4,861
0
17 Apr 2023
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
Cheng Chi
Zhenjia Xu
S. Feng
Eric A. Cousineau
Yilun Du
Benjamin Burchfiel
Russ Tedrake
Shuran Song
347
1,189
0
07 Mar 2023
CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
Oier Mees
Lukás Hermann
Erick Rosete-Beas
Wolfram Burgard
LM&Ro
105
258
0
06 Dec 2021
Sliced Recursive Transformer
Zhiqiang Shen
Zechun Liu
Eric P. Xing
ViT
49
27
0
09 Nov 2021
Fast Online Adaptation in Robotics through Meta-Learning Embeddings of Simulated Priors
Rituraj Kaushik
Timothée Anne
Jean-Baptiste Mouret
75
53
0
10 Mar 2020
1