ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2405.14093
  4. Cited By
A Survey on Vision-Language-Action Models for Embodied AI

A Survey on Vision-Language-Action Models for Embodied AI

23 May 2024
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
    LM&Ro
ArXivPDFHTML

Papers citing "A Survey on Vision-Language-Action Models for Embodied AI"

16 / 66 papers shown
Title
A Persistent Spatial Semantic Representation for High-level Natural
  Language Instruction Execution
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
133
0
12 Jul 2021
Open-vocabulary Object Detection via Vision and Language Knowledge
  Distillation
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
225
898
0
28 Apr 2021
ManipulaTHOR: A Framework for Visual Object Manipulation
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani
Winson Han
Alvaro Herrasti
Eli VanderBilt
Luca Weihs
Eric Kolve
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
171
124
0
22 Apr 2021
High-Performance Large-Scale Image Recognition Without Normalization
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
SAPIEN: A SimulAted Part-based Interactive ENvironment
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang
Yuzhe Qin
Kaichun Mo
Yikuan Xia
Hao Zhu
...
He-Nan Wang
Li Yi
Angel X. Chang
Leonidas J. Guibas
Hao Su
218
487
0
19 Mar 2020
Reasoning Over Semantic-Level Graph for Fact Checking
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
182
165
0
09 Sep 2019
Feature Pyramid Networks for Object Detection
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
183
21,813
0
09 Dec 2016
PointNet: Deep Learning on Point Sets for 3D Classification and
  Segmentation
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
222
14,103
0
02 Dec 2016
Aggregated Residual Transformations for Deep Neural Networks
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
297
10,220
0
16 Nov 2016
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image
  Segmentation
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Vijay Badrinarayanan
Alex Kendall
R. Cipolla
SSeg
446
15,639
0
02 Nov 2015
You Only Look Once: Unified, Real-Time Object Detection
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon
S. Divvala
Ross B. Girshick
Ali Farhadi
ObjD
289
36,320
0
08 Jun 2015
U-Net: Convolutional Networks for Biomedical Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
294
75,800
0
18 May 2015
Convolutional Neural Networks for Sentence Classification
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
255
13,364
0
25 Aug 2014
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
239
31,257
0
16 Jan 2013
Previous
12