Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.02208
Cited By
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts
4 June 2024
Haodong Hong
Sen Wang
Zi Huang
Qi Wu
Jiajun Liu
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts"
12 / 12 papers shown
Title
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang
Anyi Rao
Maneesh Agrawala
AI4CE
60
4,015
1
10 Feb 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
VPVLM
VLM
240
544
0
06 Oct 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
273
3,458
0
29 Apr 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Joey Tianyi Zhou
75
81
0
29 Mar 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
61
141
0
23 Feb 2022
Less is More: Generating Grounded Navigation Instructions from Landmarks
Su Wang
Ceslee Montgomery
Jordi Orbay
Vighnesh Birodkar
Aleksandra Faust
Izzeddin Gur
Natasha Jaques
Austin Waters
Jason Baldridge
Peter Anderson
87
63
0
25 Nov 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
242
407
0
13 Jul 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
70
86
0
15 Jun 2021
Towards Light-weight and Real-time Line Segment Detection
Geonmo Gu
ByungSoo Ko
SeoungHyun Go
Sung-Hyun Lee
Jingeun Lee
Minchul Shin
3DGS
20
61
0
01 Jun 2021
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
Alexander Ku
Peter Anderson
Roma Patel
Eugene Ie
Jason Baldridge
62
305
0
15 Oct 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
115
231
0
30 Apr 2020
Matterport3D: Learning from RGB-D Data in Indoor Environments
Angel X. Chang
Angela Dai
Thomas Funkhouser
Maciej Halber
Matthias Nießner
Manolis Savva
Shuran Song
Andy Zeng
Yinda Zhang
3DV
3DPC
106
1,880
0
18 Sep 2017
1