ResearchTrend.AI
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks


9 May 2025
V. Bhat
Yu-Hsiang Lan
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
arXiv: 2505.05800

Papers citing "3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks"

7 / 7 papers shown
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao
Yao Lu
Moo Jin Kim
Zipeng Fu
Zhuoyang Zhang
...
Ankur Handa
Xuan Li
Donglai Xiang
Gordon Wetzstein
Nayeon Lee
LM&Ro, LRM
99
33
0
27 Mar 2025
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang
Fangchen Liu
Letian Fu
Tingfan Wu
Mustafa Mukadam
Jitendra Malik
Ken Goldberg
Pieter Abbeel
LM&Ro, VLM
184
10
0
05 Mar 2025
RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training
Raktim Gautam Goswami
Prashanth Krishnamurthy
Yann LeCun
Farshad Khorrami
161
1
0
26 Nov 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
204
3
0
21 Nov 2024
CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision
Gi-Cheon Kang
Junghyun Kim
Kyuhwan Shim
Jun Ki Lee
Byoung-Tak Zhang
LM&Ro
327
2
1
01 Nov 2024
ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models
Sombit Dey
Jan-Nico Zaech
Nikolay Nikolov
Luc Van Gool
Danda Pani Paudel
MoMe, VLM
156
5
0
23 Sep 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Siyuan Huang
LRM, LM&Ro
121
33
0
16 Apr 2024