ResearchTrend.AI

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

20 November 2017
Peter Anderson
Qi Wu
Damien Teney
Jake Bruce
Mark Johnson
Niko Sünderhauf
Ian Reid
Stephen Gould
Anton Van Den Hengel
    LM&Ro

Papers citing "Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments"

50 / 307 papers shown
Simple and Effective Synthesis of Indoor 3D Scenes
Jing Yu Koh
Harsh Agrawal
Dhruv Batra
Richard Tucker
Austin Waters
Honglak Lee
Yinfei Yang
Jason Baldridge
Peter Anderson
VGen
3DV
24
29
0
06 Apr 2022
Inferring Rewards from Language in Context
Jessy Lin
Daniel Fried
Dan Klein
Anca Dragan
LM&Ro
27
54
0
05 Apr 2022
Moment-based Adversarial Training for Embodied Language Comprehension
Shintaro Ishikawa
K. Sugiura
LM&Ro
46
8
0
02 Apr 2022
Continuous Scene Representations for Embodied AI
S. Gadre
Kiana Ehsani
Shuran Song
Roozbeh Mottaghi
33
46
0
31 Mar 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
31
80
0
29 Mar 2022
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding
Jiabo Ye
Junfeng Tian
Ming Yan
Xiaoshan Yang
Xuwu Wang
Ji Zhang
Liang He
Xin Lin
ObjD
11
61
0
29 Mar 2022
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Manuel Kolmet
Qunjie Zhou
Aljosa Osep
Laura Leal-Taixe
21
22
0
28 Mar 2022
FedVLN: Privacy-preserving Federated Vision-and-Language Navigation
Kaiwen Zhou
Qing Guo
FedML
26
8
0
28 Mar 2022
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
Zaid Khan
B. Vijaykumar
Xiang Yu
S. Schulter
Manmohan Chandraker
Y. Fu
CLIP
VLM
25
16
0
27 Mar 2022
Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers
A. Bucker
Luis F. C. Figueredo
Sami Haddadin
Ashish Kapoor
Shuang Ma
Rogerio Bonatti
LM&Ro
35
49
0
25 Mar 2022
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
S. Gadre
Mitchell Wortsman
Gabriel Ilharco
Ludwig Schmidt
Shuran Song
CLIP
LM&Ro
35
142
0
20 Mar 2022
Object Manipulation via Visual Target Localization
Kiana Ehsani
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
26
9
0
15 Mar 2022
Summarizing a virtual robot's past actions in natural language
Chad DeChant
Daniel Bauer
LM&Ro
31
4
0
13 Mar 2022
Cross-modal Map Learning for Vision and Language Navigation
G. Georgakis
Karl Schmeckpeper
Karan Wanchoo
Soham Dan
E. Miltsakaki
Dan Roth
Kostas Daniilidis
22
64
0
10 Mar 2022
One-Shot Learning from a Demonstration with Hierarchical Latent Language
Nathaniel Weir
Xingdi Yuan
Marc-Alexandre Côté
Matthew J. Hausknecht
Romain Laroche
Ida Momennejad
H. V. Seijen
Benjamin Van Durme
BDL
19
6
0
09 Mar 2022
Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration
Xiwen Liang
Fengda Zhu
Lingling Li
Hang Xu
Xiaodan Liang
LM&Ro
VLM
30
29
0
08 Mar 2022
Modeling Coreference Relations in Visual Dialog
Mingxiao Li
Marie-Francine Moens
13
9
0
06 Mar 2022
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
Yicong Hong
Zun Wang
Qi Wu
Stephen Gould
3DV
29
64
0
05 Mar 2022
Online Learning of Reusable Abstract Models for Object Goal Navigation
Tommaso Campari
Leonardo Lamanna
P. Traverso
Luciano Serafini
Lamberto Ballan
EgoV
15
19
0
04 Mar 2022
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following
Xiaofeng Gao
Qiaozi Gao
Ran Gong
Kaixiang Lin
Govind Thattai
Gaurav Sukhatme
LM&Ro
89
70
0
27 Feb 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
28
139
0
23 Feb 2022
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation
Marco Rosano
Antonino Furnari
Luigi Gulino
C. Santoro
G. Farinella
EgoV
47
5
0
02 Feb 2022
Learning to Act with Affordance-Aware Multimodal Neural SLAM
Zhiwei Jia
Kaixiang Lin
Yizhou Zhao
Qiaozi Gao
Govind Thattai
Gaurav Sukhatme
LM&Ro
31
15
0
24 Jan 2022
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
Wenlong Huang
Pieter Abbeel
Deepak Pathak
Igor Mordatch
LM&Ro
42
1,056
0
18 Jan 2022
3D Question Answering
Shuquan Ye
Dongdong Chen
Songfang Han
Jing Liao
ViT
26
46
0
15 Dec 2021
Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
Kai Wang
Xianghao Xu
Leon Lei
Selena Ling
Natalie Lindsay
Angel X. Chang
Manolis Savva
Daniel E. Ritchie
3DV
19
5
0
10 Dec 2021
Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning
DeepMind Interactive Agents Team
Josh Abramson
Arun Ahuja
Arthur Brussee
Federico Carnevale
...
Tamara von Glehn
Greg Wayne
Nathaniel Wong
Chen Yan
Rui Zhu
LM&Ro
40
46
0
07 Dec 2021
MDFM: Multi-Decision Fusing Model for Few-Shot Learning
Shuai Shao
Lei Xing
Rui Xu
Weifeng Liu
Yanjiang Wang
Baodi Liu
38
30
0
01 Dec 2021
An in-depth experimental study of sensor usage and visual reasoning of robots navigating in real environments
Assem Sadek
G. Bono
Boris Chidlovskii
Christian Wolf
20
10
0
29 Nov 2021
Agent-Centric Relation Graph for Object Visual Navigation
X. Hu
Youfang Lin
Shuo Wang
Zhihao Wu
Kai Lv
36
19
0
29 Nov 2021
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
20
63
0
25 Nov 2021
Curriculum Learning for Vision-and-Language Navigation
Jiwen Zhang
Zhongyu Wei
Jianqing Fan
J. Peng
LM&Ro
26
21
0
14 Nov 2021
LILA: Language-Informed Latent Actions
Siddharth Karamcheti
Megha Srivastava
Percy Liang
Dorsa Sadigh
LM&Ro
30
31
0
05 Nov 2021
SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
A. Moudgil
Arjun Majumdar
Harsh Agrawal
Stefan Lee
Dhruv Batra
LM&Ro
27
57
0
27 Oct 2021
Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning
Kibeom Kim
Min Whoo Lee
Yoonsung Kim
Je-hwan Ryu
Minsu Lee
Byoung-Tak Zhang
24
8
0
25 Oct 2021
No RL, No Simulation: Learning to Navigate without Navigating
Meera Hahn
Devendra Singh Chaplot
Shubham Tulsiani
Mustafa Mukadam
James M. Rehg
Abhinav Gupta
75
71
0
18 Oct 2021
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen
Yonatan Bisk
Hal Daumé
47
16
0
14 Oct 2021
Feudal Reinforcement Learning by Reading Manuals
Kai Wang
Zhonghao Wang
Mo Yu
Humphrey Shi
OffRL
35
0
0
13 Oct 2021
Are you doing what I say? On modalities alignment in ALFRED
Ting-Rui Chiang
Yi-Ting Yeh
Ta-Chung Chi
Yau-Shian Wang
24
1
0
12 Oct 2021
Waypoint Models for Instruction-guided Navigation in Continuous Environments
Jacob Krantz
Aaron Gokaslan
Dhruv Batra
Stefan Lee
Oleksandr Maksymets
LM&Ro
137
76
0
05 Oct 2021
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments
Sonia Raychaudhuri
Saim Wani
Shivansh Patel
Unnat Jain
Angel X. Chang
LM&Ro
25
52
0
30 Sep 2021
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
208
221
0
24 Sep 2021
ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu
Elisa Kreiss
Desmond C. Ong
Christopher Potts
CoGe
LRM
29
22
0
18 Sep 2021
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
Santhosh Kumar Ramakrishnan
Aaron Gokaslan
Erik Wijmans
Oleksandr Maksymets
Alexander Clegg
...
Andrew Westbury
Angel X. Chang
Manolis Savva
Yili Zhao
Dhruv Batra
22
368
0
16 Sep 2021
Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language
Shuyan Zhou
Pengcheng Yin
Graham Neubig
LM&Ro
14
1
0
16 Sep 2021
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding
Shane Storks
Qiaozi Gao
Yichi Zhang
J. Chai
ReLM
LRM
47
22
0
10 Sep 2021
Modular Framework for Visuomotor Language Grounding
Kolby Nottingham
Litian Liang
Daeyun Shin
Charless C. Fowlkes
Roy Fox
Sameer Singh
16
12
0
05 Sep 2021
Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation
Suraj Nair
E. Mitchell
Kevin Chen
Brian Ichter
Silvio Savarese
Chelsea Finn
LM&Ro
OffRL
37
154
0
02 Sep 2021
SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments
Muhammad Zubair Irshad
Niluthpol Chowdhury Mithun
Zachary Seymour
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
LM&Ro
26
49
0
26 Aug 2021
The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation
Xiaoming Zhao
Harsh Agrawal
Dhruv Batra
A. Schwing
36
40
0
26 Aug 2021