2505.10872
REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?
16 May 2025
Chenxi Jiang
Chuhao Zhou
Jianfei Yang
Papers citing
"REI-Bench: Can Embodied Agents Understand Vague Human Instructions in Task Planning?"
22 / 22 papers shown
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions
Jin Gao
Lei Gan
Yuankai Li
Yixin Ye
Dequan Wang
52
3
0
02 Aug 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
141
915
0
31 Jul 2024
QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning
Mostafa Kotb
C. Weber
Muhammad Burhan Hafez
Stefan Wermter
93
1
0
26 Jul 2024
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation
Jiaming Liu
Mengzhen Liu
Zhenyu Wang
Lily Lee
Kaichen Zhou
Pengju An
Senqiao Yang
Renrui Zhang
Yandong Guo
Shanghang Zhang
LM&Ro
LRM
Mamba
90
18
0
06 Jun 2024
Aligning Language Models to Explicitly Handle Ambiguity
Sungmin Cho
Youna Kim
Cheonbok Park
Junyeob Kim
Choonghyun Park
Kang Min Yoo
Sang-goo Lee
Taeuk Kim
64
22
0
18 Apr 2024
A Taxonomy of Ambiguity Types for NLP
Margaret Li
Alisa Liu
Zhaofeng Wu
Noah A. Smith
UQLM
44
2
0
21 Mar 2024
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
Jae-Woo Choi
Youngwoo Yoon
Hyobin Ong
Jaehong Kim
Minsu Jang
51
18
0
13 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
167
1,287
0
05 Feb 2024
MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models
Xin Liu
Yichen Zhu
Jindong Gu
Yunshi Lan
Chao Yang
Yu Qiao
107
109
0
29 Nov 2023
An Embodied Generalist Agent in 3D World
Jiangyong Huang
Silong Yong
Xiaojian Ma
Xiongkun Linghu
Puhao Li
Yan Wang
Qing Li
Song-Chun Zhu
Baoxiong Jia
Siyuan Huang
LM&Ro
94
175
0
18 Nov 2023
Vision-Language Foundation Models as Effective Robot Imitators
Xinghang Li
Minghuan Liu
Hanbo Zhang
Cunjun Yu
Jie Xu
...
Ya Jing
Weinan Zhang
Huaping Liu
Hang Li
Tao Kong
LM&Ro
113
170
0
02 Nov 2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
...
Ted Xiao
Peng Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
192
1,291
0
28 Jul 2023
We're Afraid Language Models Aren't Modeling Ambiguity
Alisa Liu
Zhaofeng Wu
Julian Michael
Alane Suhr
Peter West
Alexander Koller
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
119
104
0
27 Apr 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,472
0
27 Feb 2023
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
Dieter Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
175
657
0
22 Sep 2022
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn
Anthony Brohan
Noah Brown
Yevgen Chebotar
Omar Cortes
...
Ted Xiao
Peng Xu
Sichun Xu
Mengyuan Yan
Andy Zeng
LM&Ro
195
1,988
0
04 Apr 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
850
9,683
0
28 Jan 2022
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Wenlong Huang
Pieter Abbeel
Deepak Pathak
Igor Mordatch
LM&Ro
104
1,125
0
18 Jan 2022
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
889
42,463
0
28 May 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Mohit Shridhar
Jesse Thomason
Daniel Gordon
Yonatan Bisk
Winson Han
Roozbeh Mottaghi
Luke Zettlemoyer
Dieter Fox
LM&Ro
122
781
0
03 Dec 2019
VirtualHome: Simulating Household Activities via Programs
Xavier Puig
K. Ra
Marko Boben
Jiaman Li
Tingwu Wang
Sanja Fidler
Antonio Torralba
LM&Ro
100
500
0
19 Jun 2018
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
86
1,111
0
14 Dec 2017