Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2303.08409
Cited By
Lana: A Language-Capable Navigator for Instruction Following and Generation
15 March 2023
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAG
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Lana: A Language-Capable Navigator for Instruction Following and Generation"
14 / 14 papers shown
Title
DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation
Yinfeng Yu
Dongsheng Yang
22
0
0
30 Apr 2025
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Yuhang Zhang
Chuan Qin
Jing Chen
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
54
0
0
23 Apr 2025
Intelligent LiDAR Navigation: Leveraging External Information and Semantic Maps with LLM as Copilot
Fujing Xie
Jiajie Zhang
Sören Schwertfeger
35
1
0
13 Sep 2024
Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis
Vishnu Sashank Dorbala
Sanjoy Chowdhury
Dinesh Manocha
LM&Ro
35
0
0
18 Mar 2024
Verifiably Following Complex Robot Instructions with Foundation Models
Benedict Quartey
Eric Rosen
Stefanie Tellex
George Konidaris
LM&Ro
44
11
0
18 Feb 2024
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Hanqing Wang
Wei Liang
Luc Van Gool
Wenguan Wang
LM&Ro
33
28
0
14 Aug 2023
PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation
Jialu Li
Joey Tianyi Zhou
DiffM
31
49
0
30 May 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu
Xiaohan Wang
Haipeng Luo
Jingdong Wang
Yi Yang
Wanli Ouyang
98
48
0
31 Dec 2022
Towards Versatile Embodied Navigation
H. Wang
Wei Liang
Luc Van Gool
Wenguan Wang
LM&Ro
47
20
0
30 Oct 2022
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
29
74
0
18 Mar 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang
Linchao Zhu
Yi Yang
164
170
0
20 Apr 2021
Language and Visual Entity Relationship Graph for Agent Navigation
Yicong Hong
Cristian Rodriguez-Opazo
Yuankai Qi
Qi Wu
Stephen Gould
LM&Ro
176
132
0
19 Oct 2020
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
260
496
0
07 Jun 2018
1