Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

15 October 2020
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge

Papers citing "Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding"

50 / 223 papers shown
lilGym: Natural Language Visual Reasoning with Reinforcement Learning
Anne Wu
Kianté Brantley
Noriyuki Kojima
Yoav Artzi
ReLM
OffRL
LRM
27
3
0
03 Nov 2022
Multilingual Multimodality: A Taxonomical Survey of Datasets, Techniques, Challenges and Opportunities
Khyathi Raghavi Chandu
A. Geramifard
40
3
0
30 Oct 2022
ULN: Towards Underspecified Vision-and-Language Navigation
Weixi Feng
Tsu-jui Fu
Yujie Lu
William Yang Wang
49
5
0
18 Oct 2022
Retrospectives on the Embodied AI Workshop
Matt Deitke
Dhruv Batra
Yonatan Bisk
Tommaso Campari
Angel X. Chang
...
Jesse Thomason
Alexander Toshev
Joanne Truong
Luca Weihs
Jiajun Wu
LM&Ro
37
51
0
13 Oct 2022
CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory
Nur Muhammad (Mahi) Shafiullah
Chris Paxton
Lerrel Pinto
Soumith Chintala
Arthur Szlam
VLM
LM&Ro
CLIP
95
156
0
11 Oct 2022
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
Aishwarya Kamath
Peter Anderson
Su Wang
Jing Yu Koh
Alexander Ku
Austin Waters
Yinfei Yang
Jason Baldridge
Zarana Parekh
LM&Ro
22
45
0
06 Oct 2022
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
40
18
0
06 Oct 2022
Improving Policy Learning via Language Dynamics Distillation
Victor Zhong
Jesse Mu
Luke Zettlemoyer
Edward Grefenstette
Tim Rocktäschel
OffRL
41
15
0
30 Sep 2022
MUG: Interactive Multimodal Grounding on User Interfaces
Tao Li
Gang Li
Jingjie Zheng
Purple Wang
Yang Li
LLMAG
35
8
0
29 Sep 2022
Human-in-the-loop Robotic Grasping using BERT Scene Representation
Yaoxian Song
Penglei Sun
Pengfei Fang
Linyi Yang
Yanghua Xiao
Yue Zhang
73
5
0
28 Sep 2022
MaXM: Towards Multilingual Visual Question Answering
Soravit Changpinyo
Linting Xue
Michal Yarom
Ashish V. Thapliyal
Idan Szpektor
J. Amelot
Xi Chen
Radu Soricut
33
8
0
12 Sep 2022
Anticipating the Unseen Discrepancy for Vision and Language Navigation
Yujie Lu
Huiliang Zhang
Ping Nie
Weixi Feng
Wenda Xu
Qing Guo
William Yang Wang
35
1
0
10 Sep 2022
JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Kai Zheng
KAI-QING Zhou
Jing Gu
Yue Fan
Jialu Wang
Zong-xiao Li
Xuehai He
Qing Guo
LM&Ro
30
39
0
28 Aug 2022
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
58
46
0
24 Aug 2022
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
27
57
0
19 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
158
436
0
10 Jul 2022
CLEAR: Improving Vision-Language Navigation with Cross-Lingual, Environment-Agnostic Representations
Jialu Li
Hao Tan
Joey Tianyi Zhou
LM&Ro
64
12
0
05 Jul 2022
1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022)
Dongyan An
Zun Wang
Yangguang Li
Yi Wang
Yicong Hong
Yan Huang
Liang Wang
Jing Shao
29
14
0
23 Jun 2022
Local Slot Attention for Vision-and-Language Navigation
Yifeng Zhuang
Qiang Sun
Yanwei Fu
Lifeng Chen
Xiangyang Xue
21
2
0
17 Jun 2022
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
Zi-Yi Dou
Nanyun Peng
24
22
0
09 Jun 2022
ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts
Bingqian Lin
Yi Zhu
Zicong Chen
Xiwen Liang
Jian-zhuo Liu
Xiaodan Liang
LM&Ro
25
51
0
31 May 2022
Training and Inference on Any-Order Autoregressive Models the Right Way
Andy Shih
Dorsa Sadigh
Stefano Ermon
BDL
TPM
OOD
CML
35
23
0
26 May 2022
Aerial Vision-and-Dialog Navigation
Yue Fan
Winson X. Chen
Tongzhou Jiang
Chun-ni Zhou
Yi Zhang
Qing Guo
46
19
0
24 May 2022
On the Limits of Evaluating Embodied Agent Model Generalization Using Validation Sets
Hyounghun Kim
Aishwarya Padmakumar
Di Jin
Joey Tianyi Zhou
Dilek Z. Hakkani-Tür
11
0
0
18 May 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
31
80
0
29 Mar 2022
FedVLN: Privacy-preserving Federated Vision-and-Language Navigation
Kaiwen Zhou
Qing Guo
FedML
26
8
0
28 Mar 2022
Analyzing Generalization of Vision and Language Navigation to Unseen Outdoor Areas
Raphael Schumann
Stefan Riezler
21
26
0
25 Mar 2022
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Jing Gu
Eliana Stefani
Qi Wu
Jesse Thomason
Qing Guo
LM&Ro
30
104
0
22 Mar 2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao
Yuankai Qi
Yicong Hong
Zheng Yu
Peifeng Wang
Qi Wu
AI4TS
27
71
0
22 Mar 2022
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation
S. Gadre
Mitchell Wortsman
Gabriel Ilharco
Ludwig Schmidt
Shuran Song
CLIP
LM&Ro
35
142
0
20 Mar 2022
Cross-modal Map Learning for Vision and Language Navigation
G. Georgakis
Karl Schmeckpeper
Karan Wanchoo
Soham Dan
E. Miltsakaki
Dan Roth
Kostas Daniilidis
22
64
0
10 Mar 2022
Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
Yicong Hong
Zun Wang
Qi Wu
Stephen Gould
3DV
29
64
0
05 Mar 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
28
139
0
23 Feb 2022
A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility
Andrea Burns
Deniz Arsan
Sanjna Agrawal
Ranjitha Kumar
Kate Saenko
Bryan A. Plummer
44
59
0
04 Feb 2022
Contrastive Instruction-Trajectory Learning for Vision-Language Navigation
Xiwen Liang
Fengda Zhu
Yi Zhu
Bingqian Lin
Bing Wang
Xiaodan Liang
24
23
0
08 Dec 2021
Explore the Potential Performance of Vision-and-Language Navigation Model: a Snapshot Ensemble Method
Wenda Qin
Teruhisa Misu
Derry Wijaya
UQCV
LM&Ro
27
5
0
28 Nov 2021
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
20
63
0
25 Nov 2021
Curriculum Learning for Vision-and-Language Navigation
Jiwen Zhang
Zhongyu Wei
Jianqing Fan
J. Peng
LM&Ro
26
21
0
14 Nov 2021
Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation
Chuang Lin
Yi-Xin Jiang
Jianfei Cai
Lizhen Qu
Gholamreza Haffari
Zehuan Yuan
28
32
0
10 Nov 2021
LILA: Language-Informed Latent Actions
Siddharth Karamcheti
Megha Srivastava
Percy Liang
Dorsa Sadigh
LM&Ro
30
31
0
05 Nov 2021
SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
A. Moudgil
Arjun Majumdar
Harsh Agrawal
Stefan Lee
Dhruv Batra
LM&Ro
27
57
0
27 Oct 2021
History Aware Multimodal Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Cordelia Schmid
Ivan Laptev
LM&Ro
28
225
0
25 Oct 2021
SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark
Victor Zhong
Austin W. Hanjie
Sida Wang
Karthik Narasimhan
Luke Zettlemoyer
19
12
0
20 Oct 2021
Rethinking the Spatial Route Prior in Vision-and-Language Navigation
Xinzhe Zhou
Wei Liu
Yadong Mu
16
6
0
12 Oct 2021
Natural Language for Human-Robot Collaboration: Problems Beyond Language Grounding
Seth Pate
Wei-ping Xu
Ziyi Yang
Maxwell Love
Siddarth Ganguri
Lawson L. S. Wong
17
7
0
09 Oct 2021
Waypoint Models for Instruction-guided Navigation in Continuous Environments
Jacob Krantz
Aaron Gokaslan
Dhruv Batra
Stefan Lee
Oleksandr Maksymets
LM&Ro
137
76
0
05 Oct 2021
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments
Sonia Raychaudhuri
Saim Wani
Shivansh Patel
Unnat Jain
Angel X. Chang
LM&Ro
25
52
0
30 Sep 2021
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
Santhosh Kumar Ramakrishnan
Aaron Gokaslan
Erik Wijmans
Oleksandr Maksymets
Alexander Clegg
...
Andrew Westbury
Angel X. Chang
Manolis Savva
Yili Zhao
Dhruv Batra
22
368
0
16 Sep 2021
Vision-Language Navigation: A Survey and Taxonomy
Wansen Wu
Tao Chang
Xinmeng Li
LM&Ro
17
19
0
26 Aug 2021
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Pierre-Louis Guhur
Makarand Tapaswi
Shizhe Chen
Ivan Laptev
Cordelia Schmid
LM&Ro
24
135
0
20 Aug 2021