2412.09082
Cited By
v3 (latest)
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
12 December 2024
Xinshuai Song
Weixing Chen
Yang Liu
Weikai Chen
Guanbin Li
Liang Lin
Papers citing
"Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method"
50 / 51 papers shown
Title
Generating Vision-Language Navigation Instructions Incorporated Fine-Grained Alignment Annotations
Yibo Cui
Liang Xie
Yu Zhao
Jiawei Sun
Erwei Yin
17
0
0
10 Jun 2025
Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents
Kaivalya Hariharan
Uzay Girit
Atticus Wang
Jacob Andreas
LLMAG
LRM
21
0
0
30 May 2025
A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI
Lik Hang Kenny Wong
Xueyang Kang
Kaixin Bai
Jianwei Zhang
144
0
0
01 May 2025
Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering
Kaixuan Jiang
Yang Liu
Weixing Chen
Jingzhou Luo
Ziliang Chen
Ling Pan
G. Li
Liang Lin
106
4
0
14 Mar 2025
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
Jingzhou Luo
Yang Liu
Weixing Chen
Zhen Li
Yansen Wang
G. Li
Liang Lin
129
3
0
05 Mar 2025
Cross-modal Causal Relation Alignment for Video Question Grounding
Weixing Chen
Yang Liu
Binglin Chen
Jiandong Su
Yongsen Zheng
Liang Lin
BDL
VGen
CML
122
2
0
05 Mar 2025
EvoAgent: Agent Autonomous Evolution with Continual World Model for Long-Horizon Tasks
Tongtong Feng
X. Wang
Zekai Zhou
Ren Wang
Yuwei Zhan
Guangyao Li
Qing Li
Wenwu Zhu
LM&Ro
166
0
0
09 Feb 2025
InfiniteWorld: A Unified Scalable Simulation Framework for General Visual-Language Robot Interaction
Pengzhen Ren
Mingxing Li
Zhen Luo
Xinshuai Song
Zhenpeng Chen
...
Changfei Fu
Yang Liu
Liang Lin
Feng Zheng
Xiaodan Liang
LM&Ro
VGen
168
11
0
08 Dec 2024
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
Maya Varma
Jean-Benoit Delbrouck
Zhihong Chen
Akshay S. Chaudhari
C. Langlotz
VLM
116
9
0
06 Nov 2024
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Xinyu Wang
Donglin Yang
Ziqin Wang
Hohin Kwan
Jinyu Chen
Wenjun Wu
Hongsheng Li
Yue Liao
Si Liu
83
18
0
09 Oct 2024
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
Gengze Zhou
Yicong Hong
Zun Wang
Xin Eric Wang
Qi Wu
LM&Ro
96
30
0
17 Jul 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Yang Liu
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
155
69
0
09 Jul 2024
Embodied Instruction Following in Unknown Environments
Zhenyu Wu
Ziwei Wang
Xiuwei Xu
Hang Yin
Yinan Liang
Angyuan Ma
Jiwen Lu
Haibin Yan
LM&Ro
99
4
0
17 Jun 2024
InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment
Yuxing Long
Wenzhe Cai
Hongcheng Wang
Guanqi Zhan
Hao Dong
118
34
0
07 Jun 2024
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation
Mukul Khanna
Ram Ramrakhya
Gunjan Chhablani
Sriram Yenamandra
Théophile Gervet
Matthew Chang
Z. Kira
Devendra Singh Chaplot
Dhruv Batra
Roozbeh Mottaghi
LM&Ro
126
34
0
09 Apr 2024
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation
Zihan Wang
Xiangyang Li
Jiahao Yang
Yeqi Liu
Junjie Hu
Ming Jiang
Shuqiang Jiang
98
19
0
02 Apr 2024
Volumetric Environment Representation for Vision-Language Navigation
Rui Liu
Wenguan Wang
Yi Yang
91
30
0
21 Mar 2024
BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation
Chengshu Li
Ruohan Zhang
J. Wong
Cem Gokmen
S. Srivastava
...
Silvio Savarese
H. Gweon
Chenxi Liu
Jiajun Wu
Fei-Fei Li
VGen
LM&Ro
VLM
77
40
0
14 Mar 2024
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation
Jiazhao Zhang
Kunyu Wang
Rongtao Xu
Gengze Zhou
Yicong Hong
Xiaomeng Fang
Qi Wu
Zhizheng Zhang
Wang He
LM&Ro
158
61
0
24 Feb 2024
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World
Yining Hong
Zishuo Zheng
Peihao Chen
Yian Wang
Junyan Li
Chuang Gan
90
37
0
16 Jan 2024
Holodeck: Language Guided Generation of 3D Embodied AI Environments
Yue Yang
Fan-Yun Sun
Luca Weihs
Eli VanderBilt
Alvaro Herrasti
...
Lingjie Liu
Chris Callison-Burch
Mark Yatskar
Aniruddha Kembhavi
Christopher Clark
LM&Ro
131
92
0
14 Dec 2023
Towards Learning a Generalist Model for Embodied Navigation
Duo Zheng
Shijia Huang
Lin Zhao
Yiwu Zhong
Liwei Wang
LM&Ro
149
61
0
04 Dec 2023
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
...
Jilan Xu
Guo Chen
Ping Luo
Limin Wang
Yu Qiao
VLM
MLLM
166
507
0
28 Nov 2023
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Yufei Wang
Zhou Xian
Feng Chen
Tsun-Hsuan Wang
Yian Wang
Katerina Fragkiadaki
Zackory M. Erickson
David Held
Chuang Gan
LM&Ro
120
110
0
02 Nov 2023
RoboVQA: Multimodal Long-Horizon Reasoning for Robotics
P. Sermanet
Tianli Ding
Jeffrey Zhao
Fei Xia
Debidatta Dwibedi
...
Pannag R Sanketi
Karol Hausman
Izhak Shafran
Brian Ichter
Yuan Cao
LM&Ro
114
54
0
01 Nov 2023
DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
Ge Zheng
Bin Yang
Jiajin Tang
Hong-Yu Zhou
Sibei Yang
LRM
MLLM
93
117
0
25 Oct 2023
Generative Skill Chaining: Long-Horizon Skill Planning with Diffusion Models
Utkarsh Aashu Mishra
Shangjie Xue
Yongxin Chen
Danfei Xu
LRM
90
73
0
13 Oct 2023
Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation
Mukul Khanna
Yongsen Mao
Hanxiao Jiang
Sanjay Haresh
Brennan Schacklett
Dhruv Batra
Alexander Clegg
Eric Undersander
Angel X. Chang
Manolis Savva
3DV
110
78
0
20 Jun 2023
Recognize Anything: A Strong Image Tagging Model
Youcai Zhang
Xinyu Huang
Jinyu Ma
Zhaoyang Li
Zhaochuan Luo
...
Tong Luo
Yaqian Li
Siyi Liu
Yandong Guo
Lei Zhang
VLM
142
242
0
06 Jun 2023
Visual Causal Scene Refinement for Video Question Answering
Yushen Wei
Yang Liu
Hongfei Yan
Guanbin Li
Liang Lin
CML
96
24
0
07 May 2023
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
Dongyan An
Hongru Wang
Wenguan Wang
Zun Wang
Yan Huang
Keji He
Liang Wang
166
67
0
06 Apr 2023
EVA-CLIP: Improved Training Techniques for CLIP at Scale
Quan-Sen Sun
Yuxin Fang
Ledell Yu Wu
Xinlong Wang
Yue Cao
CLIP
VLM
151
513
0
27 Mar 2023
Objaverse: A Universe of Annotated 3D Objects
Matt Deitke
Dustin Schwenk
Jordi Salvador
Luca Weihs
Oscar Michel
Eli VanderBilt
Ludwig Schmidt
Kiana Ehsani
Aniruddha Kembhavi
Ali Farhadi
116
975
0
15 Dec 2022
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
Chan Hee Song
Jiaman Wu
Clay Washington
Brian M Sadler
Wei-Lun Chao
Yu-Chuan Su
LLMAG
LM&Ro
170
424
0
08 Dec 2022
Habitat-Matterport 3D Semantics Dataset
Karmesh Yadav
Ram Ramrakhya
Santhosh Kumar Ramakrishnan
Theo Gervet
John Turner
...
Angel X. Chang
Dhruv Batra
Manolis Savva
Alexander Clegg
Devendra Singh Chaplot
3DV
MDE
170
92
0
11 Oct 2022
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
110
20
0
06 Oct 2022
Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
Yang Liu
Guanbin Li
Liang Lin
LRM
159
87
0
26 Jul 2022
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Jing Gu
Eliana Stefani
Qi Wu
Jesse Thomason
Xinze Wang
LM&Ro
126
112
0
22 Mar 2022
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
Yicong Hong
Zun Wang
Qi Wu
Stephen Gould
3DV
98
66
0
05 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
954
9,784
0
28 Jan 2022
FILM: Following Instructions in Language with Modular Methods
So Yeon Min
Devendra Singh Chaplot
Pradeep Ravikumar
Yonatan Bisk
Ruslan Salakhutdinov
LM&Ro
279
163
0
12 Oct 2021
SOON: Scenario Oriented Object Navigation with Graph-based Exploration
Fengda Zhu
Xiwen Liang
Yi Zhu
Xiaojun Chang
Xiaodan Liang
66
127
0
31 Mar 2021
Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments
Jacob Krantz
Erik Wijmans
Arjun Majumdar
Dhruv Batra
Stefan Lee
88
280
0
06 Apr 2020
Vision-and-Dialog Navigation
Jesse Thomason
Michael Murray
Maya Cakmak
Luke Zettlemoyer
LM&Ro
102
331
0
10 Jul 2019
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
Vihan Jain
Gabriel Ilharco
Alexander Ku
Ashish Vaswani
Eugene Ie
Jason Baldridge
LM&Ro
92
181
0
29 May 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
124
330
0
23 Apr 2019
Habitat: A Platform for Embodied AI Research
Manolis Savva
Abhishek Kadian
Oleksandr Maksymets
Yili Zhao
Erik Wijmans
...
Jia-Wei Liu
V. Koltun
Jitendra Malik
Devi Parikh
Dhruv Batra
LM&Ro
145
1,424
0
02 Apr 2019
Visual Memory for Robust Path Following
Ashish Kumar
Saurabh Gupta
David Fouhey
Sergey Levine
Jitendra Malik
61
48
0
03 Dec 2018
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
92
1,113
0
14 Dec 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
166
1,325
0
20 Nov 2017