Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.03561
Cited By
v1
v2 (latest)
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
5 February 2024
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation"
28 / 28 papers shown
Title
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues
Jason Armitage
L. Impett
Rico Sennrich
84
5
0
24 Jul 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
126
298
0
23 Jun 2022
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
Bingqian Lin
Yi Zhu
Zicong Chen
Xiwen Liang
Jian-zhuo Liu
Xiaodan Liang
LM&Ro
89
51
0
31 May 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
99
81
0
29 Mar 2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao
Yuankai Qi
Yicong Hong
Zheng Yu
Peifeng Wang
Qi Wu
AI4TS
92
74
0
22 Mar 2022
Cross-modal Map Learning for Vision and Language Navigation
G. Georgakis
Karl Schmeckpeper
Karan Wanchoo
Soham Dan
E. Miltsakaki
Dan Roth
Kostas Daniilidis
74
64
0
10 Mar 2022
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Pierre-Louis Guhur
Makarand Tapaswi
Shizhe Chen
Ivan Laptev
Cordelia Schmid
LM&Ro
49
139
0
20 Aug 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
92
86
0
15 Jun 2021
On the Evaluation of Vision-and-Language Navigation Instructions
Mingde Zhao
Peter Anderson
Vihan Jain
Su Wang
Alexander Ku
Jason Baldridge
Eugene Ie
284
53
0
26 Jan 2021
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
93
308
0
15 Oct 2020
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Wanrong Zhu
Xinze Wang
Tsu-Jui Fu
An Yan
P. Narayana
Kazoo Sone
Sugato Basu
Wenjie Wang
78
33
0
01 Jul 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
145
233
0
30 Apr 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
75
279
0
25 Feb 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Mohit Shridhar
Jesse Thomason
Daniel Gordon
Yonatan Bisk
Winson Han
Roozbeh Mottaghi
Luke Zettlemoyer
Dieter Fox
LM&Ro
109
770
0
03 Dec 2019
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
206
151
0
04 Sep 2019
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISeg
VLM
103
1,371
0
08 Aug 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
93
324
0
23 Apr 2019
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Hao Tan
Licheng Yu
Joey Tianyi Zhou
SSL
88
318
0
08 Apr 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
77
68
0
01 Mar 2019
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
80
387
0
29 Nov 2018
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
Dipendra Kumar Misra
Andrew Bennett
Valts Blukis
Eyvind Niklasson
Max Shatkhin
Yoav Artzi
LM&Ro
77
188
0
04 Sep 2018
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
317
501
0
07 Jun 2018
Learning to Navigate in Cities Without a Map
Piotr Wojciech Mirowski
Matthew Koichi Grimes
Mateusz Malinowski
Karl Moritz Hermann
Keith Anderson
Denis Teplyashin
Karen Simonyan
Koray Kavukcuoglu
Andrew Zisserman
R. Hadsell
SSL
HAI
95
319
0
31 Mar 2018
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV
3DPC
189
1,901
0
18 Sep 2017
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
139
1,248
0
02 May 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
75
558
0
14 Apr 2017
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
352
27,195
0
20 Mar 2017
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
119
501
0
12 Jan 2015
1