Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.06383
Cited By
How Much Can CLIP Benefit Vision-and-Language Tasks?
13 July 2021
Sheng Shen
Liunian Harold Li
Hao Tan
Joey Tianyi Zhou
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"How Much Can CLIP Benefit Vision-and-Language Tasks?"
4 / 104 papers shown
Title
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
252
927
0
24 Sep 2019
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
180
150
0
04 Sep 2019
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
260
498
0
07 Jun 2018
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
158
1,464
0
06 Jun 2016
Previous
1
2
3