2404.11699
Cited By
Retrieval-Augmented Embodied Agents
17 April 2024
Yichen Zhu
Zhicai Ou
Xiaofeng Mou
Jian Tang
Papers citing
"Retrieval-Augmented Embodied Agents"
14 / 14 papers shown
Title
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
Z. Wang
Yurui Dong
Fuwen Luo
Minyuan Ruan
Zhili Cheng
C. L. P. Chen
Peng Li
Yang Liu
LRM
87
0
0
13 Mar 2025
ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration
Minjie Zhu
Y. X. Zhu
Jinming Li
Zhongyi Zhou
Junjie Wen
Xiaoyu Liu
Chaomin Shen
Yaxin Peng
Feifei Feng
LM&Ro
86
3
0
26 Feb 2025
Large Language Models for Multi-Robot Systems: A Survey
Peihan Li
Zijian An
Shams Abrar
Lifeng Zhou
LM&Ro
LRM
62
6
0
06 Feb 2025
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Junjie Wen
Minjie Zhu
Y. X. Zhu
Zhibin Tang
Jinming Li
...
Chengmeng Li
Xiaoyu Liu
Yaxin Peng
Chaomin Shen
Feifei Feng
88
15
0
04 Dec 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
42
4
0
27 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Y. X. Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Chaomin Shen
Yaxin Peng
Feifei Feng
Jian Tang
LM&Ro
72
41
0
19 Sep 2024
R+X: Retrieval and Execution from Everyday Human Videos
Georgios Papagiannis
Norman Di Palo
Pietro Vitiello
Edward Johns
56
15
0
17 Jul 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Aviral Kumar
Anika Singh
F. Ebert
Mitsuhiko Nakamoto
Yanlai Yang
Chelsea Finn
Sergey Levine
OffRL
OnRL
123
66
0
11 Oct 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
125
161
0
29 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
161
457
0
12 Sep 2022
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
F. Ebert
Yanlai Yang
Karl Schmeckpeper
Bernadette Bucher
G. Georgakis
Kostas Daniilidis
Chelsea Finn
Sergey Levine
169
219
0
27 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
317
5,775
0
29 Apr 2021
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
222
14,103
0
02 Dec 2016