1612.00837
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
Papers citing
"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"
50 / 2,037 papers shown
Title
Visual Grounding Methods for VQA are Working for the Wrong Reasons!
Robik Shrestha
Kushal Kafle
Christopher Kanan
CML
66
35
0
12 Apr 2020
An Entropy Clustering Approach for Assessing Visual Question Difficulty
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
Shun'ichi Satoh
OOD
AAML
67
1
0
12 Apr 2020
Rephrasing visual questions by specifying the entropy of the answer distribution
K. Terao
Toru Tamaki
B. Raytchev
K. Kaneda
S. Satoh
OOD
49
2
0
10 Apr 2020
Learning to Scale Multilingual Representations for Vision-Language Tasks
Andrea Burns
Donghyun Kim
Derry Wijaya
Kate Saenko
Bryan A. Plummer
50
35
0
09 Apr 2020
Understanding Knowledge Gaps in Visual Question Answering: Implications for Gap Identification and Testing
Goonmeet Bajaj
Bortik Bandyopadhyay
Daniela Schmidt
Pranav Maneriker
Christopher Myers
Srinivasan Parthasarathy
39
2
0
08 Apr 2020
SHOP-VRB: A Visual Reasoning Benchmark for Object Perception
Michal Nazarczuk
K. Mikolajczyk
72
21
0
06 Apr 2020
Generating Rationales in Visual Question Answering
Hammad A. Ayyubi
Md. Mehrab Tanjim
Julian McAuley
G. Cottrell
LRM
47
6
0
04 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
200
440
0
02 Apr 2020
DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator
Hwanhee Lee
Seunghyun Yoon
Franck Dernoncourt
Doo Soon Kim
Trung Bui
Kyomin Jung
69
15
0
01 Apr 2020
Ontology-based Interpretable Machine Learning for Textual Data
Phung Lai
Nhathai Phan
Han Hu
Anuja Badeti
David Newman
Dejing Dou
33
8
0
01 Apr 2020
Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text
Difei Gao
Ke Li
Ruiping Wang
Shiguang Shan
Xilin Chen
97
113
0
31 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
124
70
0
25 Mar 2020
Linguistically Driven Graph Capsule Network for Visual Question Reasoning
Qingxing Cao
Xiaodan Liang
Keze Wang
Liang Lin
GNN
52
3
0
23 Mar 2020
Visual Question Answering for Cultural Heritage
P. Bongini
Federico Becattini
Andrew D. Bagdanov
A. Bimbo
484
24
0
22 Mar 2020
RSVQA: Visual Question Answering for Remote Sensing Data
Sylvain Lobry
Diego Marcos
J. Murray
D. Tuia
129
223
0
16 Mar 2020
Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
L. Arras
Ahmed Osman
Wojciech Samek
XAI
AAML
97
157
0
16 Mar 2020
Counterfactual Samples Synthesizing for Robust Visual Question Answering
Long Chen
Xin Yan
Jun Xiao
Hanwang Zhang
Shiliang Pu
Yueting Zhuang
OOD
AAML
224
294
0
14 Mar 2020
Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog
Shen Gao
Preslav Nakov
Chang Liu
Li Liu
Dongyan Zhao
Rui Yan
90
34
0
10 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
79
127
0
09 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
80
246
0
07 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
128
24
0
05 Mar 2020
A Study on Multimodal and Interactive Explanations for Visual Question Answering
Kamran Alipour
J. Schulze
Yi Yao
Avi Ziskind
Giedrius Burachas
64
27
0
01 Mar 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL
ObjD
CML
86
252
0
27 Feb 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
77
78
0
27 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
101
97
0
24 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
78
75
0
19 Feb 2020
Sparse and Structured Visual Attention
Pedro Henrique Martins
S. Becker
Zita Marinho
Michael Arens
81
8
0
13 Feb 2020
Component Analysis for Visual Question Answering Architectures
Camila Kolling
Jonatas Wehrmann
Rodrigo C. Barros
CoGe
41
2
0
12 Feb 2020
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
176
223
0
10 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
143
37
0
01 Feb 2020
Uncertainty based Class Activation Maps for Visual Question Answering
Badri N. Patro
Mayank Lunayach
Vinay P. Namboodiri
FAtt
UQCV
44
1
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Patel
Vinay P. Namboodiri
OOD
AAML
73
19
0
23 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
93
60
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
81
18
0
20 Jan 2020
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions
Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Tulio Ribeiro
Besmira Nushi
Ece Kamar
LRM
57
14
0
20 Jan 2020
Show, Recall, and Tell: Image Captioning with Recall Mechanism
Li Wang
Zechen Bai
Yonghua Zhang
Hongtao Lu
77
67
0
15 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
100
320
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
54
22
0
10 Jan 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Peng Gao
Sen Su
88
11
0
03 Jan 2020
All-in-One Image-Grounded Conversational Agents
Da Ju
Kurt Shuster
Y-Lan Boureau
Jason Weston
LLMAG
87
8
0
28 Dec 2019
A Review on Intelligent Object Perception Methods Combining Knowledge-based Reasoning and Machine Learning
Filippos Gouidis
Alexandros Vassiliades
Theodore Patkos
Antonis Argyros
Nick Bassiliades
Dimitris Plexousakis
OCL
73
12
0
26 Dec 2019
Smart Home Appliances: Chat with Your Fridge
Denis A. Gudovskiy
Gyuri Han
Takuya Yamaguchi
Sotaro Tsukizawa
LRM
31
4
0
19 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
36
4
0
19 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
95
159
0
16 Dec 2019
Knowledge-based Conversational Search
Svitlana Vakulenko
64
13
0
14 Dec 2019
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
60
15
0
06 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM
ObjD
150
481
0
05 Dec 2019
Deep Bayesian Active Learning for Multiple Correct Outputs
Khaled Jedoui
Ranjay Krishna
Michael S. Bernstein
Li Fei-Fei
BDL
OOD
UQCV
96
14
0
02 Dec 2019
Exposing and Correcting the Gender Bias in Image Captioning Datasets and Models
Shruti Bhargava
David A. Forsyth
FaML
77
50
0
02 Dec 2019
A Free Lunch in Generating Datasets: Building a VQG and VQA System with Attention and Humans in the Loop
Jihyeon Janel Lee
S. Arora
29
1
0
30 Nov 2019