1505.00468
Cited By
v1
v2
v3
v4
v5
v6
v7 (latest)
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Papers citing "VQA: Visual Question Answering"
50 of 2,957 papers shown
Title
Self-supervised pre-training and contrastive representation learning for multiple-choice video QA
Seonhoon Kim
Seohyeong Jeong
Eunbyul Kim
Inho Kang
Nojun Kwak
SSL
123
40
0
17 Sep 2020
Ground-truth or DAER: Selective Re-query of Secondary Information
Stephan J. Lemmer
Jason J. Corso
57
4
0
16 Sep 2020
A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes
Yuxin Ma
Arlen Fan
Jingrui He
A. R. Nelakurthi
Ross Maciejewski
84
25
0
15 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
226
630
0
10 Sep 2020
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports
Yikuan Li
Hanyin Wang
Yuan Luo
70
67
0
03 Sep 2020
SAC: Semantic Attention Composition for Text-Conditioned Image Retrieval
Surgan Jandial
Pinkesh Badjatiya
Pranit Chawla
Ayush Chopra
Mausoom Sarkar
Balaji Krishnamurthy
99
47
0
03 Sep 2020
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding
Long Chen
Wenbo Ma
Jun Xiao
Hanwang Zhang
Shih-Fu Chang
ObjD
80
94
0
03 Sep 2020
Cross-modal Knowledge Reasoning for Knowledge-based Visual Question Answering
Jiahao Yu
Zihao Zhu
Yujing Wang
Weifeng Zhang
Yue Hu
Jianlong Tan
74
100
0
31 Aug 2020
A Survey of Deep Active Learning
Pengzhen Ren
Yun Xiao
Xiaojun Chang
Po-Yao (Bernie) Huang
Zhihui Li
Brij B. Gupta
Xiaojiang Chen
Xin Wang
160
1,161
0
30 Aug 2020
A Dataset and Baselines for Visual Question Answering on Art
Noa Garcia
Chentao Ye
Zihua Liu
Qingtao Hu
Mayu Otani
Chenhui Chu
Yuta Nakashima
Teruko Mitamura
CoGe
57
56
0
28 Aug 2020
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
99
237
0
27 Aug 2020
Visual Question Answering on Image Sets
Ankan Bansal
Yuting Zhang
Rama Chellappa
CoGe
158
44
0
27 Aug 2020
Likelihood Landscapes: A Unifying Principle Behind Many Adversarial Defenses
Fu-Huei Lin
Rohit Mittapalli
Prithvijit Chattopadhyay
Daniel Bolya
Judy Hoffman
AAML
63
2
0
25 Aug 2020
Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment
Geeticka Chauhan
Ruizhi Liao
W. Wells
Jacob Andreas
Xin Wang
Seth Berkowitz
Steven Horng
Peter Szolovits
Polina Golland
MedIm
74
53
0
22 Aug 2020
Data augmentation techniques for the Video Question Answering task
Alex Falcon
Oswald Lanz
G. Serra
EgoV
47
4
0
22 Aug 2020
Linguistically-aware Attention for Reducing the Semantic-Gap in Vision-Language Tasks
K. Gouthaman
Athira M. Nambiar
K. Srinivas
Anurag Mittal
VLM
63
13
0
18 Aug 2020
Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents
Ye Zhu
Yu Wu
Yi Yang
Yan Yan
82
13
0
18 Aug 2020
DeVLBert: Learning Deconfounded Visio-Linguistic Representations
Shengyu Zhang
Tan Jiang
Tan Wang
Kun Kuang
Zhou Zhao
Jianke Zhu
Jin Yu
Hongxia Yang
Leilei Gan
OOD
81
88
0
16 Aug 2020
Graph Edit Distance Reward: Learning to Edit Scene Graph
Lichang Chen
Guosheng Lin
Shijie Wang
Qingyao Wu
57
19
0
15 Aug 2020
Weakly supervised cross-domain alignment with optimal transport
Siyang Yuan
Ke Bai
Liqun Chen
Yizhe Zhang
Chenyang Tao
Chunyuan Li
Guoyin Wang
Ricardo Henao
Lawrence Carin
OT
60
7
0
14 Aug 2020
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue
X. Jiang
Siyi Du
Zengchang Qin
Yajing Sun
Jiahao Yu
91
37
0
11 Aug 2020
Word meaning in minds and machines
Brenden M. Lake
G. Murphy
NAI
112
118
0
04 Aug 2020
Describing Textures using Natural Language
Chenyun Wu
Mikayla Timm
Subhransu Maji
3DV
58
10
0
03 Aug 2020
Eigen-CAM: Class Activation Map using Principal Components
Mohammed Bany Muhammad
M. Yeasin
80
346
0
01 Aug 2020
LEMMA: A Multi-view Dataset for Learning Multi-agent Multi-task Activities
Baoxiong Jia
Yixin Chen
Siyuan Huang
Yixin Zhu
Song-Chun Zhu
42
54
0
31 Jul 2020
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
160
30
0
31 Jul 2020
Towards Ecologically Valid Research on Language User Interfaces
H. D. Vries
Dzmitry Bahdanau
Christopher D. Manning
293
52
0
28 Jul 2020
AiR: Attention with Reasoning Capability
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
56
36
0
28 Jul 2020
REXUP: I REason, I EXtract, I UPdate with Structured Compositional Reasoning for Visual Question Answering
Siwen Luo
S. Han
Kaiyuan Sun
Josiah Poon
CoGe
LRM
ReLM
83
4
0
27 Jul 2020
Contrastive Visual-Linguistic Pretraining
Lei Shi
Kai Shuang
Shijie Geng
Peng Su
Zhengkai Jiang
Peng Gao
Zuohui Fu
Gerard de Melo
Sen Su
VLM
SSL
CLIP
105
29
0
26 Jul 2020
Spatially Aware Multimodal Transformers for TextVQA
Yash Kant
Dhruv Batra
Peter Anderson
Alex Schwing
Devi Parikh
Jiasen Lu
Harsh Agrawal
100
86
0
23 Jul 2020
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
Tao Jin
Siyu Huang
Ming Chen
Yingming Li
Zhongfei Zhang
108
56
0
23 Jul 2020
Multimodal Dialogue State Tracking By QA Approach with Data Augmentation
Xiangyang Mou
Brandyn Sigouin
Ian Steenstra
Hui Su
49
9
0
20 Jul 2020
Referring Expression Comprehension: A Survey of Methods and Datasets
Yanyuan Qiao
Chaorui Deng
Qi Wu
ObjD
126
99
0
19 Jul 2020
Understanding Spatial Relations through Multiple Modalities
Soham Dan
Hangfeng He
Dan Roth
33
6
0
19 Jul 2020
Sketching Image Gist: Human-Mimetic Hierarchical Scene Graph Generation
Wenbin Wang
Ruiping Wang
Shiguang Shan
Xilin Chen
3DH
102
53
0
17 Jul 2020
Knowledge-Based Video Question Answering with Unsupervised Scene Descriptions
Noa Garcia
Yuta Nakashima
92
32
0
17 Jul 2020
Active Visual Information Gathering for Vision-Language Navigation
Hanqing Wang
Wenguan Wang
Tianmin Shu
Wei Liang
Jianbing Shen
136
73
0
15 Jul 2020
XAlgo: a Design Probe of Explaining Algorithms' Internal States via Question-Answering
Juan Rebanal
Yuqi Tang
Jordan Combitsis
Xiang 'Anthony' Chen
94
3
0
14 Jul 2020
Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder
K. Gouthaman
Anurag Mittal
98
79
0
13 Jul 2020
Applying recent advances in Visual Question Answering to Record Linkage
Marko Smilevski
22
0
0
12 Jul 2020
IQ-VQA: Intelligent Visual Question Answering
Vatsal Goel
Mohit Chandak
A. Anand
Prithwijit Guha
64
5
0
08 Jul 2020
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers
Shijie Geng
Peng Gao
Moitreya Chatterjee
Chiori Hori
Jonathan Le Roux
Yongfeng Zhang
Hongsheng Li
A. Cherian
101
11
0
08 Jul 2020
Targeting the Benchmark: On Methodology in Current Natural Language Processing Research
David Schlangen
69
58
0
07 Jul 2020
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets
Jianing Yang
Yuying Zhu
Yongxin Wang
Ruitao Yi
Amir Zadeh
Louis-Philippe Morency
56
12
0
07 Jul 2020
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training
Yingwei Pan
Yehao Li
Jianjie Luo
Jun Xu
Ting Yao
Tao Mei
100
59
0
05 Jul 2020
Modality Shifting Attention Network for Multi-modal Video Question Answering
Junyeong Kim
Minuk Ma
T. Pham
Kyungsu Kim
Chang D. Yoo
84
72
0
04 Jul 2020
Visual Question Answering as a Multi-Task Problem
A. E. Pollard
J. Shapiro
19
7
0
03 Jul 2020
Scene Graph Reasoning for Visual Question Answering
Marcel Hildebrandt
Hang Li
Rajat Koner
Volker Tresp
Stephan Günnemann
GNN
79
64
0
02 Jul 2020
The Impact of Explanations on AI Competency Prediction in VQA
Kamran Alipour
Arijit Ray
Xiaoyu Lin
J. Schulze
Yi Yao
Giedrius Burachas
54
9
0
02 Jul 2020