Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1911.07883
Cited By
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
18 November 2019
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks"
50 / 65 papers shown
Title
DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation
Yinfeng Yu
Dongsheng Yang
22
0
0
30 Apr 2025
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Yuhang Zhang
Chuan Qin
Jing Chen
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
54
0
0
23 Apr 2025
HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard
Yifei Dong
Fengyi Wu
Qi He
Heng Li
Minghan Li
...
Yuxuan Zhou
Jingdong Sun
Qi Dai
Zhi-Qi Cheng
Alexander G. Hauptmann
LM&Ro
50
0
0
18 Mar 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu
Yaojie Shen
Chenxi Luo
Tiejian Luo
Yan Huang
Yuewei Lin
Heng Fan
L. Zhang
63
1
0
16 Feb 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Joey Tianyi Zhou
Parisa Kordjamshidi
LRM
63
18
0
31 Dec 2024
iWalker: Imperative Visual Planning for Walking Humanoid Robot
Xiao Lin
Yuhao Huang
Taimeng Fu
Xiaobin Xiong
Chen Wang
36
0
0
27 Sep 2024
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Yanyuan Qiao
Wenqi Lyu
Hui Wang
Zixu Wang
Zerui Li
Yuan Zhang
Mingkui Tan
Qi Wu
LRM
36
3
0
27 Sep 2024
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
Bingqian Lin
Yunshuang Nie
Ziming Wei
Jiaqi Chen
Shikui Ma
Jianhua Han
Hang Xu
Xiaojun Chang
Xiaodan Liang
LM&Ro
LRM
62
20
0
12 Mar 2024
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
Bingqian Lin
Yanxin Long
Yi Zhu
Fengda Zhu
Xiaodan Liang
QiXiang Ye
Liang Lin
31
5
0
09 Mar 2024
Continual Referring Expression Comprehension via Dual Modular Memorization
Hengtao Shen
Cheng Chen
Peng Wang
Lianli Gao
Hao Wu
Jingkuan Song
ObjD
27
3
0
25 Nov 2023
Multi-Level Compositional Reasoning for Interactive Instruction Following
Suvaansh Bhambri
Byeonghwi Kim
Jonghyun Choi
LM&Ro
38
11
0
18 Aug 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Hanqing Wang
Wei Liang
Luc Van Gool
Wenguan Wang
LM&Ro
33
28
0
14 Aug 2023
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation
Jingyang Huo
Qiang Sun
Boyan Jiang
Haitao Lin
Yanwei Fu
36
19
0
26 May 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou
Yicong Hong
Qi Wu
ELM
LM&Ro
LLMAG
LRM
25
142
0
26 May 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
56
21
0
07 Apr 2023
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
Xiangyang Li
Zihan Wang
Jiahao Yang
Yaowei Wang
Shuqiang Jiang
LM&Ro
15
38
0
28 Mar 2023
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Minyoung Hwang
Jaeyeon Jeong
Minsoo Kim
Yoonseon Oh
Songhwai Oh
22
19
0
07 Mar 2023
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
Zongtao He
Liuyi Wang
Shu Li
Qingqing Yan
Chengju Liu
Qi Chen
19
7
0
02 Mar 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
41
3
0
13 Feb 2023
Diffusion-based Generation, Optimization, and Planning in 3D Scenes
Siyuan Huang
Zan Wang
Puhao Li
Baoxiong Jia
Tengyu Liu
Yixin Zhu
Wei Liang
Song-Chun Zhu
DiffM
64
201
0
15 Jan 2023
Graph based Environment Representation for Vision-and-Language Navigation in Continuous Environments
Ting Wang
Zongkai Wu
Feiyu Yao
Donglin Wang
51
5
0
11 Jan 2023
Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following
Y. Inoue
Hiroki Ohashi
LM&Ro
30
43
0
07 Nov 2022
Bridging the visual gap in VLN via semantically richer instructions
Joaquín Ossandón
Benjamín Earle
Alvaro Soto
35
0
0
27 Oct 2022
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Runhao Zeng
Thomas H. Li
Mingkui Tan
Chuang Gan
SSL
36
62
0
14 Oct 2022
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
40
18
0
06 Oct 2022
Anticipating the Unseen Discrepancy for Vision and Language Navigation
Yujie Lu
Huiliang Zhang
Ping Nie
Weixi Feng
Wenda Xu
Qing Guo
William Yang Wang
35
1
0
10 Sep 2022
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
27
57
0
19 Jul 2022
Local Slot Attention for Vision-and-Language Navigation
Yifeng Zhuang
Qiang Sun
Yanwei Fu
Lifeng Chen
Xiangyang Xue
21
2
0
17 Jun 2022
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
Zi-Yi Dou
Nanyun Peng
22
22
0
09 Jun 2022
Multi-View Transformer for 3D Visual Grounding
Shijia Huang
Yilun Chen
Jiaya Jia
Liwei Wang
25
112
0
05 Apr 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
31
80
0
29 Mar 2022
elBERto: Self-supervised Commonsense Learning for Question Answering
Xunlin Zhan
Yuan Li
Xiao Dong
Xiaodan Liang
Zhiting Hu
Lawrence Carin
SSL
RALM
LRM
24
7
0
17 Mar 2022
Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration
Xiwen Liang
Fengda Zhu
Lingling Li
Hang Xu
Xiaodan Liang
LM&Ro
VLM
30
29
0
08 Mar 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
28
139
0
23 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
213
0
18 Feb 2022
Curriculum Learning for Vision-and-Language Navigation
Jiwen Zhang
Zhongyu Wei
Jianqing Fan
J. Peng
LM&Ro
26
21
0
14 Nov 2021
SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
A. Moudgil
Arjun Majumdar
Harsh Agrawal
Stefan Lee
Dhruv Batra
LM&Ro
27
57
0
27 Oct 2021
ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu
Elisa Kreiss
Desmond C. Ong
Christopher Potts
CoGe
LRM
29
22
0
18 Sep 2021
Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language
Shuyan Zhou
Pengcheng Yin
Graham Neubig
LM&Ro
14
1
0
16 Sep 2021
SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
Muhammad Zubair Irshad
Niluthpol Chowdhury Mithun
Zachary Seymour
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
LM&Ro
26
49
0
26 Aug 2021
Communicative Learning with Natural Gestures for Embodied Navigation Agents with Human-in-the-Scene
Qi Wu
Cheng-Ju Wu
Yixin Zhu
Jungseock Joo
43
14
0
05 Aug 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
196
405
0
13 Jul 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
47
45
0
26 Jun 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
56
86
0
15 Jun 2021
Episodic Transformer for Vision-and-Language Navigation
Alexander Pashevich
Cordelia Schmid
Chen Sun
LM&Ro
43
193
0
13 May 2021
Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation
Muhammad Zubair Irshad
Chih-Yao Ma
Z. Kira
LM&Ro
27
49
0
21 Apr 2021
The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
Yuankai Qi
Zizheng Pan
Yicong Hong
Ming-Hsuan Yang
Anton Van Den Hengel
Qi Wu
LM&Ro
24
68
0
09 Apr 2021
SOON: Scenario Oriented Object Navigation with Graph-based Exploration
Fengda Zhu
Xiwen Liang
Yi Zhu
Xiaojun Chang
Xiaodan Liang
24
122
0
31 Mar 2021
Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu
Yuankai Qi
P. Narayana
Kazoo Sone
Sugato Basu
Qing Guo
Qi Wu
M. Eckstein
Luu Anh Tuan
LM&Ro
27
50
0
30 Mar 2021
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding
Yongfei Liu
Bo Wan
Lin Ma
Xuming He
ObjD
16
55
0
24 Mar 2021
1
2
Next