ResearchTrend.AI
Structured Scene Memory for Vision-Language Navigation


5 March 2021
Hanqing Wang
Wenguan Wang
Wei Liang
Caiming Xiong
Jianbing Shen
    LM&Ro

Papers citing "Structured Scene Memory for Vision-Language Navigation"

50 / 76 papers shown
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Y. Zhang
Chuan Qin
Bo Li
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
54
0
0
23 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
68
1
0
03 Apr 2025
Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation
Yifan Xie
Binkai Ou
Fei Ma
Yaohua Liu
47
0
0
14 Mar 2025
PanoGen++: Domain-Adapted Text-Guided Panoramic Environment Generation for Vision-and-Language Navigation
Sen Wang
Dongliang Zhou
Liang Xie
Chao Xu
Ye Yan
Erwei Yin
DiffM
75
2
0
13 Mar 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Mohit Bansal
Parisa Kordjamshidi
LRM
63
18
0
31 Dec 2024
Vision-Language Navigation with Energy-Based Policy
Rui Liu
Wenguan Wang
Y. Yang
40
3
0
18 Oct 2024
MO-DDN: A Coarse-to-Fine Attribute-based Exploration Agent for Multi-object Demand-driven Navigation
Hongcheng Wang
Peiqi Liu
Wenzhe Cai
Mingdong Wu
Zhengyu Qian
Hao Dong
21
0
0
04 Oct 2024
Resolving Positional Ambiguity in Dialogues by Vision-Language Models for Robot Navigation
Kuan-Lin Chen
Tzu-Ti Wei
Li-Tzu Yeh
Elaine Kao
Yu-Chee Tseng
Jen-Jee Chen
LM&Ro
24
0
0
30 Sep 2024
StratXplore: Strategic Novelty-seeking and Instruction-aligned Exploration for Vision and Language Navigation
Muraleekrishna Gopinathan
Jumana Abu-Khalaf
David Suter
Martin Masek
29
0
0
09 Sep 2024
Vision-Language Navigation with Continual Learning
Zhiyuan Li
Yanfeng Lv
Ziqin Tu
Di Shang
Hong Qiao
37
2
0
04 Sep 2024
Narrowing the Gap between Vision and Action in Navigation
Yue Zhang
Parisa Kordjamshidi
28
2
0
19 Aug 2024
Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan
Rui Liu
Wenguan Wang
Yi Yang
42
5
0
21 Jul 2024
PRET: Planning with Directed Fidelity Trajectory for Vision and Language Navigation
Renjie Lu
Jingke Meng
Wei-Shi Zheng
33
3
0
16 Jul 2024
Controllable Navigation Instruction Generation with Chain of Thought Prompting
Xianghao Kong
Jinyu Chen
Wenguan Wang
Hang Su
Xiaolin Hu
Yi Yang
Si Liu
LRM
42
4
0
10 Jul 2024
Augmented Commonsense Knowledge for Remote Object Grounding
Bahram Mohammadi
Yicong Hong
Yuankai Qi
Qi Wu
Shirui Pan
J. Shi
38
7
0
03 Jun 2024
Vision-and-Language Navigation Generative Pretrained Transformer
Hanlin Wen
LM&Ro
27
0
0
27 May 2024
AIGeN: An Adversarial Approach for Instruction Generation in VLN
Niyati Rawal
Roberto Bigazzi
Lorenzo Baraldi
Rita Cucchiara
GAN
46
4
0
15 Apr 2024
DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning
Mengfei Du
Binhao Wu
Jiwen Zhang
Zhihao Fan
Zejun Li
Ruipu Luo
Xuanjing Huang
Zhongyu Wei
33
3
0
02 Apr 2024
Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation
Bowen Huang
Yanwei Zheng
Chuanlin Lan
Xinpeng Zhao
Yifei Zou
Dongxiao Yu
36
0
0
23 Mar 2024
Volumetric Environment Representation for Vision-Language Navigation
Rui Liu
Wenguan Wang
Yi Yang
34
25
0
21 Mar 2024
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation
Ming Xu
Zilong Xie
33
2
0
18 Mar 2024
Vision-Language Navigation with Embodied Intelligence: A Survey
Peng Gao
Peng Wang
Feng Gao
Fei-Yue Wang
Ruyue Yuan
LM&Ro
37
2
0
22 Feb 2024
NavHint: Vision and Language Navigation Agent with a Hint Generator
Yue Zhang
Quan Guo
Parisa Kordjamshidi
LLMAG
30
9
0
04 Feb 2024
DGMem: Learning Visual Navigation Policy without Any Labels by Dynamic Graph Memory
Wenzhe Cai
Teng Wang
Guangran Cheng
Lele Xu
Changyin Sun
19
1
0
30 Nov 2023
Robust Navigation with Cross-Modal Fusion and Knowledge Transfer
Wenzhe Cai
Guangran Cheng
Lingyue Kong
Lu Dong
Changyin Sun
26
1
0
23 Sep 2023
Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions
Yuxing Long
Xiaoqi Li
Wenzhe Cai
Hao Dong
LLMAG
LM&Ro
19
44
0
20 Sep 2023
Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation
Yibo Cui
Liang Xie
Yakun Zhang
Meishan Zhang
Ye Yan
Erwei Yin
LM&Ro
29
16
0
24 Aug 2023
Omnidirectional Information Gathering for Knowledge Transfer-based Audio-Visual Navigation
Jinyu Chen
Wenguan Wang
Siying Liu
Hongsheng Li
Yi Yang
20
8
0
20 Aug 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Hanqing Wang
Wei Liang
Luc Van Gool
Wenguan Wang
LM&Ro
30
28
0
14 Aug 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation
Ruitao Liu
Xiaohan Wang
Wenguan Wang
Yi Yang
15
50
0
09 Aug 2023
Scaling Data Generation in Vision-and-Language Navigation
Zun Wang
Jialu Li
Yicong Hong
Yi Wang
Qi Wu
Mohit Bansal
Stephen Gould
Hao Tan
Yu Qiao
LM&Ro
34
56
0
28 Jul 2023
Active Robot Vision for Distant Object Change Detection: A Lightweight Training Simulator Inspired by Multi-Armed Bandits
Kouki Terashima
Kanji Tanaka
Ryo Yamamoto
Jonathan Tay Yu Liang
26
2
0
26 Jul 2023
Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation
Haitian Zeng
Xiaohan Wang
Wenguan Wang
Yi Yang
21
7
0
25 Jul 2023
GridMM: Grid Memory Map for Vision-and-Language Navigation
Zihan Wang
Xiangyang Li
Jiahao Yang
Yeqi Liu
Shuqiang Jiang
33
52
0
24 Jul 2023
PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li
Mohit Bansal
DiffM
29
49
0
30 May 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou
Yicong Hong
Qi Wu
ELM
LM&Ro
LLMAG
LRM
25
141
0
26 May 2023
A Multi-modal Approach to Single-modal Visual Place Classification
Tomoya Iwasaki
Kanji Tanaka
Kenta Tsukahara
21
0
0
10 May 2023
Active Semantic Localization with Graph Neural Embedding
Mitsuki Yoshida
Kanji Tanaka
Ryo Yamamoto
Daiki Iwata
22
1
0
10 May 2023
A Dual Semantic-Aware Recurrent Global-Adaptive Network For Vision-and-Language Navigation
Liuyi Wang
Zongtao He
Jiagui Tang
Ronghao Dang
Naijia Wang
Chengju Liu
Qi Chen
22
17
0
05 May 2023
Improving Vision-and-Language Navigation by Generating Future-View Image Semantics
Jialu Li
Mohit Bansal
23
34
0
11 Apr 2023
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
Dongyan An
H. Wang
Wenguan Wang
Zun Wang
Yan Huang
Keji He
Liang Wang
58
63
0
06 Apr 2023
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
Xiangyang Li
Zihan Wang
Jiahao Yang
Yaowei Wang
Shuqiang Jiang
LM&Ro
13
38
0
28 Mar 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAG
LM&Ro
36
38
0
15 Mar 2023
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Minyoung Hwang
Jaeyeon Jeong
Minsoo Kim
Yoonseon Oh
Songhwai Oh
19
19
0
07 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
33
13
0
07 Mar 2023
ESceme: Vision-and-Language Navigation with Episodic Scene Memory
Qinjie Zheng
Daqing Liu
Chaoyue Wang
Jing Zhang
Dadong Wang
Dacheng Tao
LM&Ro
33
5
0
02 Mar 2023
RREx-BoT: Remote Referring Expressions with a Bag of Tricks
Gunnar A. Sigurdsson
Jesse Thomason
Gaurav Sukhatme
Robinson Piramuthu
LM&Ro
25
8
0
30 Jan 2023
Graph based Environment Representation for Vision-and-Language Navigation in Continuous Environments
Ting Wang
Zongkai Wu
Feiyu Yao
Donglin Wang
51
5
0
11 Jan 2023
BEVBert: Multimodal Map Pre-training for Language-guided Navigation
Dongyan An
Yuankai Qi
Yangguang Li
Yan Huang
Liangsheng Wang
T. Tan
Jing Shao
35
58
0
08 Dec 2022
CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation
Vishnu Sashank Dorbala
Gunnar A. Sigurdsson
Robinson Piramuthu
Jesse Thomason
Gaurav Sukhatme
LM&Ro
31
55
0
30 Nov 2022