ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.03561
  4. Cited By
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language
  Navigation
v1v2 (latest)

VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation

5 February 2024
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
ArXiv (abs)PDFHTML

Papers citing "VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation"

28 / 28 papers shown
Title
A Priority Map for Vision-and-Language Navigation with Trajectory Plans
  and Feature-Location Cues
A Priority Map for Vision-and-Language Navigation with Trajectory Plans and Feature-Location Cues
Jason Armitage
L. Impett
Rico Sennrich
84
5
0
24 Jul 2022
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online
  Videos
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
126
298
0
23 Jun 2022
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
Bingqian Lin
Yi Zhu
Zicong Chen
Xiwen Liang
Jian-zhuo Liu
Xiaodan Liang
LM&Ro
89
51
0
31 May 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
99
81
0
29 Mar 2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language
  Navigation
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao
Yuankai Qi
Yicong Hong
Zheng Yu
Peifeng Wang
Qi Wu
AI4TS
92
74
0
22 Mar 2022
Cross-modal Map Learning for Vision and Language Navigation
Cross-modal Map Learning for Vision and Language Navigation
G. Georgakis
Karl Schmeckpeper
Karan Wanchoo
Soham Dan
E. Miltsakaki
Dan Roth
Kostas Daniilidis
74
64
0
10 Mar 2022
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Pierre-Louis Guhur
Makarand Tapaswi
Shizhe Chen
Ivan Laptev
Cordelia Schmid
LM&Ro
49
139
0
20 Aug 2021
Vision-Language Navigation with Random Environmental Mixup
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
92
86
0
15 Jun 2021
On the Evaluation of Vision-and-Language Navigation Instructions
On the Evaluation of Vision-and-Language Navigation Instructions
Mingde Zhao
Peter Anderson
Vihan Jain
Su Wang
Alexander Ku
Jason Baldridge
Eugene Ie
284
53
0
26 Jan 2021
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense
  Spatiotemporal Grounding
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
93
308
0
15 Oct 2020
Multimodal Text Style Transfer for Outdoor Vision-and-Language
  Navigation
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Wanrong Zhu
Xinze Wang
Tsu-Jui Fu
An Yan
P. Narayana
Kazoo Sone
Sugato Basu
Wenjie Wang
78
33
0
01 Jul 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the
  Web
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
145
233
0
30 Apr 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via
  Pre-training
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
75
279
0
25 Feb 2020
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday
  Tasks
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
Mohit Shridhar
Jesse Thomason
Daniel Gordon
Yonatan Bisk
Winson Han
Roozbeh Mottaghi
Luke Zettlemoyer
Dieter Fox
LM&Ro
109
770
0
03 Dec 2019
Help, Anna! Visual Navigation with Natural Multimodal Assistance via
  Retrospective Curiosity-Encouraging Imitation Learning
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&RoEgoV
206
151
0
04 Sep 2019
LVIS: A Dataset for Large Vocabulary Instance Segmentation
LVIS: A Dataset for Large Vocabulary Instance Segmentation
Agrim Gupta
Piotr Dollár
Ross B. Girshick
ISegVLM
103
1,371
0
08 Aug 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor
  Environments
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
93
324
0
23 Apr 2019
Learning to Navigate Unseen Environments: Back Translation with
  Environmental Dropout
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Hao Tan
Licheng Yu
Joey Tianyi Zhou
SSL
88
318
0
08 Apr 2019
Learning To Follow Directions in Street View
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
77
68
0
01 Mar 2019
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual
  Street Environments
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
80
387
0
29 Nov 2018
Mapping Instructions to Actions in 3D Environments with Visual Goal
  Prediction
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
Dipendra Kumar Misra
Andrew Bennett
Valts Blukis
Eyvind Niklasson
Max Shatkhin
Yoav Artzi
LM&Ro
77
188
0
04 Sep 2018
Speaker-Follower Models for Vision-and-Language Navigation
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&RoLRM
317
501
0
07 Jun 2018
Learning to Navigate in Cities Without a Map
Learning to Navigate in Cities Without a Map
Piotr Wojciech Mirowski
Matthew Koichi Grimes
Mateusz Malinowski
Karl Moritz Hermann
Keith Anderson
Denis Teplyashin
Karen Simonyan
Koray Kavukcuoglu
Andrew Zisserman
R. Hadsell
SSLHAI
95
319
0
31 Mar 2018
Matterport3D: Learning from RGB-D Data in Indoor Environments
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV3DPC
189
1,901
0
18 Sep 2017
Dense-Captioning Events in Videos
Dense-Captioning Events in Videos
Ranjay Krishna
Kenji Hata
F. Ren
Li Fei-Fei
Juan Carlos Niebles
139
1,248
0
02 May 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering
Y. Jang
Yale Song
Youngjae Yu
Youngjin Kim
Gunhee Kim
75
558
0
14 Apr 2017
Mask R-CNN
Mask R-CNN
Kaiming He
Georgia Gkioxari
Piotr Dollár
Ross B. Girshick
ObjD
352
27,195
0
20 Mar 2017
A Dataset for Movie Description
A Dataset for Movie Description
Anna Rohrbach
Marcus Rohrbach
Niket Tandon
Bernt Schiele
VGen
119
501
0
12 Jan 2015
1