Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2405.14093
Cited By
A Survey on Vision-Language-Action Models for Embodied AI
23 May 2024
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Survey on Vision-Language-Action Models for Embodied AI"
16 / 66 papers shown
Title
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
133
0
12 Jul 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
225
898
0
28 Apr 2021
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani
Winson Han
Alvaro Herrasti
Eli VanderBilt
Luca Weihs
Eric Kolve
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
171
124
0
22 Apr 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang
Yuzhe Qin
Kaichun Mo
Yikuan Xia
Hao Zhu
...
He-Nan Wang
Li Yi
Angel X. Chang
Leonidas J. Guibas
Hao Su
218
487
0
19 Mar 2020
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
182
165
0
09 Sep 2019
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
183
21,813
0
09 Dec 2016
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
222
14,103
0
02 Dec 2016
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
297
10,220
0
16 Nov 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
446
15,639
0
02 Nov 2015
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
289
36,320
0
08 Jun 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
294
75,800
0
18 May 2015
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
255
13,364
0
25 Aug 2014
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
239
31,257
0
16 Jan 2013
Previous
1
2