ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2503.18065
  4. Cited By
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation

Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation

23 March 2025
Ziming Wei
Bingqian Lin
Yunshuang Nie
Jiaqi Chen
Shikui Ma
Hang Xu
Xiaodan Liang
ArXiv (abs)PDFHTML

Papers citing "Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation"

40 / 40 papers shown
Title
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
P. Zhang
Yifei Su
Pengyuan Wu
Dong An
Li Zhang
Zhigang Wang
Dong Wang
Yan Ding
Bin Zhao
Xuelong Li
LM&Ro
70
0
0
27 May 2025
Correctable Landmark Discovery via Large Models for Vision-Language
  Navigation
Correctable Landmark Discovery via Large Models for Vision-Language Navigation
Bingqian Lin
Yunshuang Nie
Ziming Wei
Yi Zhu
Hang Xu
Shikui Ma
Jianzhuang Liu
Xiaodan Liang
LM&Ro
111
8
0
29 May 2024
MapGPT: Map-Guided Prompting with Adaptive Path Planning for
  Vision-and-Language Navigation
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation
Jiaqi Chen
Bingqian Lin
Ran Xu
Zhenhua Chai
Xiaodan Liang
Kwan-Yee K. Wong
LM&RoLLMAG
65
31
0
14 Jan 2024
PanoGen: Text-Conditioned Panoramic Environment Generation for
  Vision-and-Language Navigation
PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li
Joey Tianyi Zhou
DiffM
92
53
0
30 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with
  Instruction Tuning
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLMVLM
139
2,095
0
11 May 2023
Visual Instruction Tuning
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDaVLMMLLM
569
4,910
0
17 Apr 2023
Tag2Text: Guiding Vision-Language Model via Image Tagging
Tag2Text: Guiding Vision-Language Model via Image Tagging
Xinyu Huang
Youcai Zhang
Jinyu Ma
Weiwei Tian
Rui Feng
Yuejie Zhang
Yaqian Li
Yandong Guo
Lei Zhang
CLIPMLLMVLM3DV
109
76
0
10 Mar 2023
BEVBert: Multimodal Map Pre-training for Language-guided Navigation
BEVBert: Multimodal Map Pre-training for Language-guided Navigation
Dongyan An
Yuankai Qi
Yangguang Li
Yan Huang
Liangsheng Wang
Tieniu Tan
Jing Shao
78
64
0
08 Dec 2022
Learning from Unlabeled 3D Environments for Vision-and-Language
  Navigation
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
129
48
0
24 Aug 2022
Target-Driven Structured Transformer Planner for Vision-Language
  Navigation
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
78
60
0
19 Jul 2022
Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous
  Environments
Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments
Jacob Krantz
Stefan Lee
45
37
0
20 Apr 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
101
83
0
29 Mar 2022
HOP: History-and-Order Aware Pre-training for Vision-and-Language
  Navigation
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao
Yuankai Qi
Yicong Hong
Zheng Yu
Peifeng Wang
Qi Wu
AI4TS
92
75
0
22 Mar 2022
Think Global, Act Local: Dual-scale Graph Transformer for
  Vision-and-Language Navigation
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
92
147
0
23 Feb 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLMBDLVLMCLIP
555
4,413
0
28 Jan 2022
Contrastive Instruction-Trajectory Learning for Vision-Language
  Navigation
Contrastive Instruction-Trajectory Learning for Vision-Language Navigation
Xiwen Liang
Fengda Zhu
Yi Zhu
Bingqian Lin
Bing Wang
Xiaodan Liang
67
23
0
08 Dec 2021
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language
  Navigation in Continuous Environments
Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments
Sonia Raychaudhuri
Saim Wani
Shivansh Patel
Unnat Jain
Angel X. Chang
LM&Ro
73
54
0
30 Sep 2021
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments
  for Embodied AI
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
Santhosh Kumar Ramakrishnan
Aaron Gokaslan
Erik Wijmans
Oleksandr Maksymets
Alexander Clegg
...
Andrew Westbury
Angel X. Chang
Manolis Savva
Yili Zhao
Dhruv Batra
87
393
0
16 Sep 2021
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Airbert: In-domain Pretraining for Vision-and-Language Navigation
Pierre-Louis Guhur
Makarand Tapaswi
Shizhe Chen
Ivan Laptev
Cordelia Schmid
LM&Ro
52
142
0
20 Aug 2021
Vision-Language Navigation with Random Environmental Mixup
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
98
87
0
15 Jun 2021
The Road to Know-Where: An Object-and-Room Informed Sequential BERT for
  Indoor Vision-Language Navigation
The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
Yuankai Qi
Zizheng Pan
Yicong Hong
Ming-Hsuan Yang
Anton Van Den Hengel
Qi Wu
LM&Ro
70
69
0
09 Apr 2021
Scene-Intuitive Agent for Remote Embodied Visual Grounding
Scene-Intuitive Agent for Remote Embodied Visual Grounding
Xiangru Lin
Guanbin Li
Yizhou Yu
LM&Ro
66
52
0
24 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
975
29,871
0
26 Feb 2021
A Recurrent Vision-and-Language BERT for Navigation
A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong
Qi Wu
Yuankai Qi
Cristian Rodriguez-Opazo
Stephen Gould
LM&Ro
104
302
0
26 Nov 2020
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense
  Spatiotemporal Grounding
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
93
314
0
15 Oct 2020
Object-and-Action Aware Model for Visual Language Navigation
Object-and-Action Aware Model for Visual Language Navigation
Yuankai Qi
Zizheng Pan
Shengping Zhang
Anton Van Den Hengel
Qi Wu
LM&Ro
53
113
0
29 Jul 2020
Soft Expert Reward Learning for Vision-and-Language Navigation
Soft Expert Reward Learning for Vision-and-Language Navigation
Hu Wang
Qi Wu
Chunhua Shen
53
51
0
21 Jul 2020
Evolving Graphical Planner: Contextual Global Planning for
  Vision-and-Language Navigation
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
Zhiwei Deng
Karthik Narasimhan
Olga Russakovsky
72
88
0
11 Jul 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
140
1,944
0
13 Apr 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via
  Pre-training
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
90
280
0
25 Feb 2020
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning
  Tasks
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
LRM
70
242
0
18 Nov 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal
  Pre-training
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSLVLMMLLM
211
905
0
16 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
153
1,965
0
09 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for
  Vision-and-Language Tasks
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSLVLM
243
3,695
0
06 Aug 2019
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor
  Environments
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
Yuankai Qi
Qi Wu
Peter Anderson
Xinze Wang
Wenjie Wang
Chunhua Shen
Anton Van Den Hengel
LM&Ro
107
330
0
23 Apr 2019
Learning to Navigate Unseen Environments: Back Translation with
  Environmental Dropout
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Hao Tan
Licheng Yu
Joey Tianyi Zhou
SSL
88
322
0
08 Apr 2019
Habitat: A Platform for Embodied AI Research
Habitat: A Platform for Embodied AI Research
Manolis Savva
Abhishek Kadian
Oleksandr Maksymets
Yili Zhao
Erik Wijmans
...
Jia-Wei Liu
V. Koltun
Jitendra Malik
Devi Parikh
Dhruv Batra
LM&Ro
126
1,423
0
02 Apr 2019
Speaker-Follower Models for Vision-and-Language Navigation
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&RoLRM
319
505
0
07 Jun 2018
Matterport3D: Learning from RGB-D Data in Indoor Environments
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV3DPC
208
1,917
0
18 Sep 2017
A Reduction of Imitation Learning and Structured Prediction to No-Regret
  Online Learning
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
Stéphane Ross
Geoffrey J. Gordon
J. Andrew Bagnell
OffRL
254
3,238
0
02 Nov 2010
1