2111.08896
Achieving Human Parity on Visual Question Answering
17 November 2021
Ming Yan
Haiyang Xu
Chenliang Li
Junfeng Tian
Bin Bi
Wei Wang
Weihua Chen
Xianzhe Xu
Fan Wang
Zhen Cao
Zhicheng Zhang
Qiyu Zhang
Ji Zhang
Songfang Huang
Fei Huang
Luo Si
Rong Jin
Papers citing
"Achieving Human Parity on Visual Question Answering"
8 / 8 papers shown
Title
Adaptive loose optimization for robust question answering
Jie Ma
Pinghui Wang
Ze-you Wang
Dechen Kong
Min Hu
Tingxu Han
Jun Liu
OOD
38
4
0
06 May 2023
One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks
Gregor Geigle
Chen Cecilia Liu
Jonas Pfeiffer
Iryna Gurevych
VLM
28
1
0
12 Oct 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
46
3,349
0
29 Apr 2022
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
293
1,084
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
310
3,708
0
11 Feb 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
297
10,220
0
16 Nov 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
158
1,464
0
06 Jun 2016