Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1505.00468
Cited By
v1
v2
v3
v4
v5
v6
v7 (latest)
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"VQA: Visual Question Answering"
50 / 2,957 papers shown
Title
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
78
63
0
11 Mar 2020
MQA: Answering the Question via Robotic Manipulation
Yuhong Deng
Di Guo
F. Sun
Naifu Zhang
Huaping Liu
Chen Pang
76
22
0
10 Mar 2020
Deconfounded Image Captioning: A Causal Retrospect
Xu Yang
Hanwang Zhang
Jianfei Cai
CML
79
127
0
09 Mar 2020
Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach
Yu-Siang Wang
Yen-Ling Kuo
Boris Katz
57
3
0
08 Mar 2020
PathVQA: 30000+ Questions for Medical Visual Question Answering
Xuehai He
Yichen Zhang
Luntian Mou
Eric Xing
P. Xie
LM&MA
77
246
0
07 Mar 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani
Rami Ben-Ari
Daniel Rotman
A. Bronstein
134
126
0
06 Mar 2020
XGPT: Cross-modal Generative Pre-Training for Image Captioning
Qiaolin Xia
Haoyang Huang
Nan Duan
Dongdong Zhang
Lei Ji
Zhifang Sui
Edward Cui
Taroon Bharti
Xin Liu
Ming Zhou
MLLM
VLM
103
76
0
03 Mar 2020
Natural Language Processing Advancements By Deep Learning: A Survey
A. Torfi
Rouzbeh A. Shirvani
Yaser Keneshloo
Nader Tavvaf
Edward A. Fox
AI4CE
VLM
151
222
0
02 Mar 2020
A Study on Multimodal and Interactive Explanations for Visual Question Answering
Kamran Alipour
J. Schulze
Yi Yao
Avi Ziskind
Giedrius Burachas
64
27
0
01 Mar 2020
Visual Commonsense R-CNN
Tan Wang
Jianqiang Huang
Hanwang Zhang
Qianru Sun
SSL
ObjD
CML
86
252
0
27 Feb 2020
Unshuffling Data for Improved Generalization
Damien Teney
Ehsan Abbasnejad
Anton Van Den Hengel
OOD
77
78
0
27 Feb 2020
What BERT Sees: Cross-Modal Transfer for Visual Question Generation
Thomas Scialom
Patrick Bordes
Paul-Alexis Dray
Jacopo Staiano
Patrick Gallinari
59
6
0
25 Feb 2020
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering
Xinyu Wang
Yuliang Liu
Chunhua Shen
Chun Chet Ng
Canjie Luo
Lianwen Jin
C. Chan
Anton Van Den Hengel
Liangwei Wang
101
97
0
24 Feb 2020
Shallow2Deep: Indoor Scene Modeling by Single Image Understanding
Y. Nie
Shihui Guo
Jian Chang
Xiaoguang Han
Jiahui Huang
Shimin Hu
Jianjun Zhang
3DPC
3DV
154
16
0
22 Feb 2020
Interactive Natural Language-based Person Search
Vikram Shree
Wei-Lun Chao
M. Campbell
28
4
0
19 Feb 2020
VQA-LOL: Visual Question Answering under the Lens of Logic
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
CoGe
71
75
0
19 Feb 2020
CQ-VQA: Visual Question Answering on Categorized Questions
Aakansha Mishra
A. Anand
Prithwijit Guha
145
6
0
17 Feb 2020
Looking Enhances Listening: Recovering Missing Speech Using Images
Tejas Srinivasan
Ramon Sanabria
Florian Metze
59
15
0
13 Feb 2020
Component Analysis for Visual Question Answering Architectures
Camila Kolling
Jonatas Wehrmann
Rodrigo C. Barros
CoGe
36
2
0
12 Feb 2020
Object Detection as a Positive-Unlabeled Problem
Yuewei Yang
Kevin J. Liang
Lawrence Carin
82
39
0
11 Feb 2020
Solving Raven's Progressive Matrices with Neural Networks
Tao Zhuo
Mohan S. Kankanhalli
105
26
0
05 Feb 2020
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog
Zekang Li
Zongjia Li
Jinchao Zhang
Yang Feng
Cheng Niu
Jie Zhou
143
37
0
01 Feb 2020
Break It Down: A Question Understanding Benchmark
Tomer Wolfson
Mor Geva
Ankit Gupta
Matt Gardner
Yoav Goldberg
Daniel Deutch
Jonathan Berant
91
188
0
31 Jan 2020
Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach
Mehrdad Alizadeh
Barbara Di Eugenio
23
3
0
31 Jan 2020
Dual Convolutional LSTM Network for Referring Image Segmentation
Linwei Ye
Zhi Liu
Yang Wang
76
46
0
30 Jan 2020
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas Guibas
3DPC
254
253
0
29 Jan 2020
Deep Bayesian Network for Visual Question Generation
Badri N. Patro
V. Kurmi
Sandeep Kumar
Vinay P. Namboodiri
BDL
36
18
0
23 Jan 2020
Robust Explanations for Visual Question Answering
Badri N. Patro
Shivansh Pate
Vinay P. Namboodiri
OOD
AAML
71
19
0
23 Jan 2020
ManyModalQA: Modality Disambiguation and QA over Diverse Inputs
Darryl Hannan
Akshay Jain
Joey Tianyi Zhou
AAML
88
60
0
22 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
134
263
0
22 Jan 2020
Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models
M. Farazi
Salman H. Khan
Nick Barnes
79
18
0
20 Jan 2020
SQuINTing at VQA Models: Introspecting VQA Models with Sub-Questions
Ramprasaath R. Selvaraju
Purva Tendulkar
Devi Parikh
Eric Horvitz
Marco Tulio Ribeiro
Besmira Nushi
Ece Kamar
LRM
57
14
0
20 Jan 2020
Modality-Balanced Models for Visual Dialogue
Hyounghun Kim
Hao Tan
Joey Tianyi Zhou
61
27
0
17 Jan 2020
A "Network Pruning Network" Approach to Deep Model Compression
Vinay Kumar Verma
Pravendra Singh
Vinay P. Namboodiri
Piyush Rai
3DPC
VLM
58
8
0
15 Jan 2020
CheXplain: Enabling Physicians to Explore and UnderstandData-Driven, AI-Enabled Medical Imaging Analysis
Yao Xie
Melody Chen
David Kao
Ge Gao
Xiang Ánthony' Chen
141
131
0
15 Jan 2020
Joint Reasoning for Multi-Faceted Commonsense Knowledge
Yohan Chalier
Simon Razniewski
Gerhard Weikum
LRM
134
25
0
13 Jan 2020
Detecting depression in dyadic conversations with multimodal narratives and visualizations
Joshua Y. Kim
Greyson Y. Kim
K. Yacef
45
7
0
13 Jan 2020
In Defense of Grid Features for Visual Question Answering
Huaizu Jiang
Ishan Misra
Marcus Rohrbach
Erik Learned-Miller
Xinlei Chen
OOD
ObjD
88
320
0
10 Jan 2020
Visual Question Answering on 360° Images
Shih-Han Chou
Wei-Lun Chao
Wei-Sheng Lai
Min Sun
Ming-Hsuan Yang
52
22
0
10 Jan 2020
Think Locally, Act Globally: Federated Learning with Local and Global Representations
Paul Pu Liang
Terrance Liu
Liu Ziyin
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
FedML
124
570
0
06 Jan 2020
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
Lei Shi
Shijie Geng
Kai Shuang
Chiori Hori
Songxiang Liu
Peng Gao
Sen Su
85
11
0
03 Jan 2020
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
A. Magassouba
K. Sugiura
Hisashi Kawai
81
9
0
23 Dec 2019
Exploring Context, Attention and Audio Features for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
18
1
0
20 Dec 2019
Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Jonathan Huang
L. Nachman
37
7
0
20 Dec 2019
Segmentations-Leak: Membership Inference Attacks and Defenses in Semantic Image Segmentation
Yang He
Shadi Rahimian
Bernt Schiele
Mario Fritz
MIACV
92
52
0
20 Dec 2019
Deep Exemplar Networks for VQA and VQG
Badri N. Patro
Vinay P. Namboodiri
31
4
0
19 Dec 2019
Going Beneath the Surface: Evaluating Image Captioning for Grammaticality, Truthfulness and Diversity
Huiyuan Xie
Tom Sherborne
A. Kuhnle
Ann A. Copestake
DiffM
38
9
0
19 Dec 2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Vedika Agarwal
Rakshith Shetty
Mario Fritz
CML
AAML
93
159
0
16 Dec 2019
CLOSURE: Assessing Systematic Generalization of CLEVR Models
Dzmitry Bahdanau
H. D. Vries
Timothy J. O'Donnell
Shikhar Murty
Philippe Beaudoin
Yoshua Bengio
Aaron Courville
NAI
80
90
0
12 Dec 2019
Multimodal Self-Supervised Learning for Medical Image Analysis
Aiham Taleb
C. Lippert
T. Klein
Moin Nabi
SSL
91
98
0
11 Dec 2019
Previous
1
2
3
...
43
44
45
...
58
59
60
Next