ResearchTrend.AI
arXiv:1811.10092
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

25 November 2018
Xin Eric Wang
Qiuyuan Huang
Asli Celikyilmaz
Jianfeng Gao
Dinghan Shen
Yuan-fang Wang
William Yang Wang
Lei Zhang
LM&Ro, SSL

Papers citing "Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation"

50 / 63 papers shown
VISTA: Generative Visual Imagination for Vision-and-Language Navigation
Yanjia Huang
Mingyang Wu
Renjie Li
Zhengzhong Tu
LM&Ro
105
0
0
09 May 2025
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Yanzhe Zhang
Chuan Qin
Jing Chen
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
121
1
0
23 Apr 2025
UAS Visual Navigation in Large and Unseen Environments via a Meta Agent
Yuci Han
Charles Toth
Alper Yilmaz
95
0
0
20 Mar 2025
HA-VLN: A Benchmark for Human-Aware Navigation in Discrete-Continuous Environments with Dynamic Multi-Human Interactions, Real-World Validation, and an Open Leaderboard
Yifei Dong
Fengyi Wu
Qi He
Heng Li
Minghan Li
...
Yuxuan Zhou
Jingdong Sun
Qi Dai
Zhi-Qi Cheng
Alexander G. Hauptmann
LM&Ro
81
0
0
18 Mar 2025
TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation
Navid Rajabi
Jana Kosecka
LM&Ro, 3DV
123
0
0
11 Feb 2025
Evaluating Vision-Language Models as Evaluators in Path Planning
Mohamed Aghzal
Xiang Yue
Erion Plaku
Ziyu Yao
LRM
167
1
0
27 Nov 2024
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Yanyuan Qiao
Wenqi Lyu
Hui Wang
Zixu Wang
Zerui Li
Yuan Zhang
Mingkui Tan
Qi Wu
LRM
81
6
0
27 Sep 2024
Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation
Weize Li
Zhicheng Zhao
Haochen Bai
Fei Su
98
0
0
24 May 2024
Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation
Francesco Taioli
Stefano Rosa
A. Castellini
Lorenzo Natale
Alessio Del Bue
Alessandro Farinelli
Marco Cristani
Yiming Wang
93
5
0
15 Mar 2024
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
Bingqian Lin
Yunshuang Nie
Ziming Wei
Jiaqi Chen
Shikui Ma
Jianhua Han
Hang Xu
Xiaojun Chang
Xiaodan Liang
LM&Ro, LRM
120
27
0
12 Mar 2024
All in One: Exploring Unified Vision-Language Tracking with Multi-Modal Alignment
Chunhui Zhang
Xin Sun
Li Liu
Yiqian Yang
Qiong Liu
Xiaoping Zhou
Yanfeng Wang
192
16
0
07 Jul 2023
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling
Tsu-Jui Fu
Xin Eric Wang
Matthew F. Peterson
Scott T. Grafton
Miguel P. Eckstein
William Yang Wang
103
43
0
17 Nov 2019
Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation
Liyiming Ke
Xiujun Li
Yonatan Bisk
Ari Holtzman
Zhe Gan
Jingjing Liu
Jianfeng Gao
Yejin Choi
S. Srinivasa
96
168
0
06 Mar 2019
The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation
Chih-Yao Ma
Zuxuan Wu
G. Al-Regib
Caiming Xiong
Z. Kira
LM&Ro
85
174
0
05 Mar 2019
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
Chih-Yao Ma
Jiasen Lu
Zuxuan Wu
G. Al-Regib
Z. Kira
R. Socher
Caiming Xiong
LM&Ro
92
277
0
10 Jan 2019
Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention
Khanh Nguyen
Debadeepta Dey
Chris Brockett
W. Dolan
LM&Ro
76
131
0
10 Dec 2018
MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment
Da Zhang
Xiyang Dai
Xin Eric Wang
Yuan-fang Wang
L. Davis
78
305
0
30 Nov 2018
Shifting the Baseline: Single Modality Performance on Visual Navigation & QA
Jesse Thomason
Daniel Gordon
Yonatan Bisk
85
75
0
01 Nov 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
109
676
0
21 Sep 2018
Gibson Env: Real-World Perception for Embodied Agents
F. Xia
Amir Zamir
Zhi-Yang He
Alexander Sax
Jitendra Malik
Silvio Savarese
AI4CE, LM&Ro
79
828
0
31 Aug 2018
On Evaluation of Embodied Navigation Agents
Peter Anderson
Angel X. Chang
Devendra Singh Chaplot
Alexey Dosovitskiy
Saurabh Gupta
...
Jana Kosecka
Jitendra Malik
Roozbeh Mottaghi
Manolis Savva
Amir Zamir
117
802
0
18 Jul 2018
Self-Imitation Learning
Junhyuk Oh
Yijie Guo
Satinder Singh
Honglak Lee
SSL
65
251
0
14 Jun 2018
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro, LRM
319
504
0
07 Jun 2018
Turbo Learning for Captionbot and Drawingbot
Qiuyuan Huang
Pengchuan Zhang
D. Wu
Lei Zhang
43
25
0
21 May 2018
Visual Representations for Semantic Target Driven Navigation
Arsalan Mousavian
Alexander Toshev
Marek Fiser
Jana Kosecka
Ayzaan Wahid
James Davidson
70
202
0
15 May 2018
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
Xin Eric Wang
Wenhu Chen
Yuan-fang Wang
William Yang Wang
59
159
0
24 Apr 2018
Watch, Listen, and Describe: Globally and Locally Aligned Cross-Modal Attentions for Video Captioning
Xin Eric Wang
Yuan-fang Wang
William Yang Wang
58
76
0
15 Apr 2018
Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
Xin Eric Wang
Wenhan Xiong
Hongmin Wang
William Yang Wang
78
201
0
21 Mar 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
233
11,566
0
15 Feb 2018
AI2-THOR: An Interactive 3D Environment for Visual AI
Eric Kolve
Roozbeh Mottaghi
Winson Han
Eli VanderBilt
Luca Weihs
...
Daniel Gordon
Yuke Zhu
Aniruddha Kembhavi
Abhinav Gupta
Ali Farhadi
LM&Ro
84
1,110
0
14 Dec 2017
MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments
Manolis Savva
Angel X. Chang
Alexey Dosovitskiy
Thomas Funkhouser
V. Koltun
87
247
0
11 Dec 2017
Embodied Question Answering
Abhishek Das
Samyak Datta
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
100
651
0
30 Nov 2017
Video Captioning via Hierarchical Reinforcement Learning
Xin Eric Wang
Wenhu Chen
Jiawei Wu
Yuan-fang Wang
William Yang Wang
88
229
0
29 Nov 2017
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
LM&Ro
106
1,322
0
20 Nov 2017
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems
Zachary Chase Lipton
Xiujun Li
Jianfeng Gao
Lihong Li
Faisal Ahmed
Li Deng
80
172
0
15 Nov 2017
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV, 3DPC
205
1,916
0
18 Sep 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
123
4,221
0
25 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
786
132,363
0
12 Jun 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM, SSL
122
2,451
0
15 May 2017
Count-Based Exploration with Neural Density Models
Georg Ostrovski
Marc G. Bellemare
Aaron van den Oord
Rémi Munos
86
625
0
03 Mar 2017
Semantic Scene Completion from a Single Depth Image
Shuran Song
Feng Yu
Andy Zeng
Angel X. Chang
Manolis Savva
Thomas Funkhouser
3DV
90
1,245
0
28 Nov 2016
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
155
1,002
0
26 Nov 2016
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
Haoran Tang
Rein Houthooft
Davis Foote
Adam Stooke
Xi Chen
Yan Duan
John Schulman
F. Turck
Pieter Abbeel
OffRL
108
775
0
15 Nov 2016
Learning to Navigate in Complex Environments
Piotr Wojciech Mirowski
Razvan Pascanu
Fabio Viola
Hubert Soyer
Andy Ballard
...
Ross Goroshin
Laurent Sifre
Koray Kavukcuoglu
D. Kumaran
R. Hadsell
107
880
0
11 Nov 2016
Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning
Yuke Zhu
Roozbeh Mottaghi
Eric Kolve
Joseph J. Lim
Abhinav Gupta
Li Fei-Fei
Ali Farhadi
VGen
73
1,527
0
16 Sep 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
131
1,275
0
31 Jul 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
179
1,483
0
06 Jun 2016
Segmentation from Natural Language Expressions
Ronghang Hu
Marcus Rohrbach
Trevor Darrell
VLM, EgoV
76
437
0
20 Mar 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,426
0
10 Dec 2015
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
120
752
0
09 Dec 2015