Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1612.00837
Cited By
v1
v2
v3 (latest)
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"
50 / 2,037 papers shown
Title
REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering
Siwen Luo
S. Han
Kaiyuan Sun
Josiah Poon
CoGe
LRM
ReLM
83
4
0
27 Jul 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Peng Gao
Zuohui Fu
Gerard de Melo
Sen Su
VLM
SSL
CLIP
105
29
0
26 Jul 2020
Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data
Michael Cogswell
Jiasen Lu
Rishabh Jain
Stefan Lee
Devi Parikh
Dhruv Batra
VLM
EgoV
78
15
0
24 Jul 2020
Fine-Grained Image Captioning with Global-Local Discriminative Objective
Jie Wu
Tianshui Chen
Hefeng Wu
Zhi Yang
Guangchun Luo
Liang Lin
70
59
0
21 Jul 2020
Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering
Ruixue Tang
Chao Ma
W. Zhang
Qi Wu
Xiaokang Yang
OOD
72
49
0
19 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
98
79
0
13 Jul 2020
IQ-VQA: Intelligent Visual Question Answering
Vatsal Goel
Mohit Chandak
A. Anand
Prithwijit Guha
64
5
0
08 Jul 2020
Targeting the Benchmark: On Methodology in Current Natural Language Processing Research
David Schlangen
74
58
0
07 Jul 2020
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets
Jianing Yang
Yuying Zhu
Yongxin Wang
Ruitao Yi
Amir Zadeh
Louis-Philippe Morency
69
12
0
07 Jul 2020
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering
Qing Li
Siyuan Huang
Yining Hong
Song-Chun Zhu
119
29
0
03 Jul 2020
The Impact of Explanations on AI Competency Prediction in VQA
Kamran Alipour
Arijit Ray
Xiaoyu Lin
J. Schulze
Yi Yao
Giedrius Burachas
59
9
0
02 Jul 2020
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
180
748
0
01 Jul 2020
Graph Optimal Transport for Cross-Domain Alignment
Liqun Chen
Zhe Gan
Yu Cheng
Linjie Li
Lawrence Carin
Jingjing Liu
OT
127
152
0
26 Jun 2020
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
Saeed Amizadeh
Hamid Palangi
Oleksandr Polozov
Yichen Huang
K. Koishida
NAI
LRM
121
60
0
20 Jun 2020
Overcoming Statistical Shortcuts for Open-ended Visual Counting
Corentin Dancette
Rémi Cadène
Xinlei Chen
Matthieu Cord
43
3
0
17 Jun 2020
Sparse and Continuous Attention Mechanisms
André F. T. Martins
António Farinhas
Marcos Vinícius Treviso
Vlad Niculae
P. Aguiar
Mário A. T. Figueiredo
85
42
0
12 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
184
437
0
11 Jun 2020
Exploring Weaknesses of VQA Models through Attribution Driven Insights
Shaunak Halbe
47
2
0
11 Jun 2020
Large-Scale Adversarial Training for Vision-and-Language Representation Learning
Zhe Gan
Yen-Chun Chen
Linjie Li
Chen Zhu
Yu Cheng
Jingjing Liu
ObjD
VLM
171
501
0
11 Jun 2020
Estimating semantic structure for the VQA answer space
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
47
4
0
10 Jun 2020
Roses Are Red, Violets Are Blue... but Should Vqa Expect Them To?
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
OOD
90
90
0
09 Jun 2020
Counterfactual VQA: A Cause-Effect Look at Language Bias
Yulei Niu
Kaihua Tang
Hanwang Zhang
Zhiwu Lu
Xiansheng Hua
Ji-Rong Wen
CML
151
403
0
08 Jun 2020
Structured Multimodal Attentions for TextVQA
Chenyu Gao
Qi Zhu
Peng Wang
Hui Li
Yuliang Liu
Anton Van Den Hengel
Qi Wu
110
60
0
01 Jun 2020
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law
Damien Teney
Kushal Kafle
Robik Shrestha
Ehsan Abbasnejad
Christopher Kanan
Anton Van Den Hengel
OODD
OOD
112
147
0
19 May 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models
Jize Cao
Zhe Gan
Yu Cheng
Licheng Yu
Yen-Chun Chen
Jingjing Liu
VLM
133
130
0
15 May 2020
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Hyounghun Kim
Zineng Tang
Joey Tianyi Zhou
80
31
0
13 May 2020
Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond
Zhuosheng Zhang
Hai Zhao
Rui Wang
115
63
0
13 May 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
91
36
0
12 May 2020
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
Douwe Kiela
Hamed Firooz
Aravind Mohan
Vedanuj Goswami
Amanpreet Singh
Pratik Ringshia
Davide Testuggine
113
613
0
10 May 2020
What is Learned in Visually Grounded Neural Syntax Acquisition
Noriyuki Kojima
Hadar Averbuch-Elor
Alexander M. Rush
Yoav Artzi
77
22
0
04 May 2020
Visual Question Answering with Prior Class Semantics
Violetta Shevchenko
Damien Teney
A. Dick
Anton Van Den Hengel
BDL
63
7
0
04 May 2020
Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO
Zarana Parekh
Jason Baldridge
Daniel Cer
Austin Waters
Yinfei Yang
78
62
0
30 Apr 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
210
236
0
30 Apr 2020
STARC: Structured Annotations for Reading Comprehension
Yevgeni Berzak
J. Malmaud
R. Levy
65
24
0
30 Apr 2020
Dynamic Language Binding in Relational Visual Reasoning
T. Le
Vuong Le
Svetha Venkatesh
T. Tran
NAI
71
19
0
30 Apr 2020
Look at the First Sentence: Position Bias in Question Answering
Miyoung Ko
Jinhyuk Lee
Hyunjae Kim
Gangwoo Kim
Jaewoo Kang
FaML
OOD
80
100
0
30 Apr 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
120
382
0
30 Apr 2020
Pragmatic Issue-Sensitive Image Captioning
Allen Nie
Reuben Cohn-Gordon
Christopher Potts
53
24
0
29 Apr 2020
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions
Xiang Zhou
Yixin Nie
Hao Tan
Joey Tianyi Zhou
113
41
0
28 Apr 2020
A Novel Attention-based Aggregation Function to Combine Vision and Language
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
9
0
27 Apr 2020
Deep Multimodal Neural Architecture Search
Zhou Yu
Yuhao Cui
Jun-chen Yu
Meng Wang
Dacheng Tao
Qi Tian
77
100
0
25 Apr 2020
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Duy-Kien Nguyen
Vedanuj Goswami
Xinlei Chen
71
23
0
24 Apr 2020
Debiasing Skin Lesion Datasets and Models? Not So Fast
Alceu Bissoto
Eduardo Valle
Sandra Avila
102
55
0
23 Apr 2020
Visual Question Answering Using Semantic Information from Image Descriptions
Tasmia Tasrin
Md Sultan al Nahian
Brent Harrison
30
0
0
23 Apr 2020
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
SSL
CML
99
119
0
20 Apr 2020
Are we pretraining it right? Digging deeper into visio-linguistic pretraining
Amanpreet Singh
Vedanuj Goswami
Devi Parikh
VLM
80
48
0
19 Apr 2020
Multiple Visual-Semantic Embedding for Video Retrieval from Query Sentence
Huy Manh Nguyen
Tomo Miyazaki
Yoshihiro Sugaya
S. Omachi
144
1
0
16 Apr 2020
Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training
Joe Stacey
Pasquale Minervini
Haim Dubossarsky
Sebastian Riedel
Tim Rocktaschel
AI4CE
79
8
0
16 Apr 2020
Shortcut Learning in Deep Neural Networks
Robert Geirhos
J. Jacobsen
Claudio Michaelis
R. Zemel
Wieland Brendel
Matthias Bethge
Felix Wichmann
237
2,074
0
16 Apr 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
259
1,955
0
13 Apr 2020
Previous
1
2
3
...
34
35
36
...
39
40
41
Next