1505.00468
Cited By
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Papers citing
"VQA: Visual Question Answering"
50 / 2,957 papers shown
Title
Vispi: Automatic Visual Perception and Interpretation of Chest X-rays
X. Li
Rui Cao
D. Zhu
79
20
0
12 Jun 2019
Relationship-Embedded Representation Learning for Grounding Referring Expressions
Sibei Yang
Guanbin Li
Yizhou Yu
ObjD
93
55
0
11 Jun 2019
Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering
Claudio Greco
Barbara Plank
Raquel Fernández
Raffaella Bernardi
CLL
KELM
72
50
0
10 Jun 2019
Multimodal Logical Inference System for Visual-Textual Entailment
Riko Suzuki
Hitomi Yanaka
Masashi Yoshikawa
K. Mineshima
D. Bekki
NAI
81
17
0
10 Jun 2019
A Survey of Reinforcement Learning Informed by Natural Language
Jelena Luketina
Nantas Nardelli
Gregory Farquhar
Jakob N. Foerster
Jacob Andreas
Edward Grefenstette
Shimon Whiteson
Tim Rocktaschel
LM&Ro
KELM
OffRL
LRM
110
282
0
10 Jun 2019
Joint Visual Grounding with Language Scene Graphs
Daqing Liu
Hanwang Zhang
Zhengjun Zha
Meng Wang
Qianru Sun
71
6
0
09 Jun 2019
Adversarial Mahalanobis Distance-based Attentive Song Recommender for Automatic Playlist Continuation
Thanh-Binh Tran
Renee Sweeney
Kyumin Lee
70
32
0
08 Jun 2019
Figure Captioning with Reasoning and Sequence-Level Training
Charles C. Chen
Ruiyi Zhang
Eunyee Koh
Sungchul Kim
Scott D. Cohen
Tong Yu
Ryan Rossi
Razvan Bunescu
AIMat
69
39
0
07 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
146
478
0
06 Jun 2019
Context-Aware Visual Policy Network for Fine-Grained Image Captioning
Zhengjun Zha
Daqing Liu
Hanwang Zhang
Yongdong Zhang
Feng Wu
66
122
0
06 Jun 2019
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Nazneen Rajani
Bryan McCann
Caiming Xiong
R. Socher
ReLM
LRM
130
566
0
06 Jun 2019
Learning to Compose and Reason with Language Tree Structures for Visual Grounding
Richang Hong
Daqing Liu
Xiaoyu Mo
Xiangnan He
Hanwang Zhang
ReLM
LRM
98
166
0
05 Jun 2019
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue
J. Haber
Tim Baumgärtner
Ece Takmaz
Lieke Gelderloos
Elia Bruni
Raquel Fernández
64
77
0
04 Jun 2019
Generating Question Relevant Captions to Aid Visual Question Answering
Jialin Wu
Zeyuan Hu
Raymond J. Mooney
121
43
0
03 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
192
1,095
0
31 May 2019
Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes
S. Lukin
C. Bonial
Clare R. Voss
10
2
0
31 May 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
153
361
0
31 May 2019
What Can Neural Networks Reason About?
Keyulu Xu
Jingling Li
Mozhi Zhang
S. Du
Ken-ichi Kawarabayashi
Stefanie Jegelka
NAI
AI4CE
112
248
0
30 May 2019
Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback
Hui Wu
Yupeng Gao
Xiaoxiao Guo
Ziad Al-Halah
Steven J. Rennie
Kristen Grauman
Rogerio Feris
EgoV
156
68
0
30 May 2019
What Makes Training Multi-Modal Classification Networks Hard?
Weiyao Wang
Du Tran
Matt Feiszli
182
453
0
29 May 2019
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
Vihan Jain
Gabriel Ilharco
Alexander Ku
Ashish Vaswani
Eugene Ie
Jason Baldridge
LM&Ro
92
182
0
29 May 2019
Vision-to-Language Tasks Based on Attributes and Attention Mechanism
Xuelong Li
Aihong Yuan
Xiaoqiang Lu
77
37
0
29 May 2019
Leveraging Medical Visual Question Answering with Supporting Facts
Tomasz Kornuta
Deepta Rajan
Chaitanya P. Shivade
Alexis Asseman
A. Ozcan
51
16
0
28 May 2019
Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning
Wonjae Kim
Yoonho Lee
33
6
0
28 May 2019
Gaining Extra Supervision via Multi-task learning for Multi-Modal Video Question Answering
Junyeong Kim
Minuk Ma
Kyungsu Kim
Sungjin Kim
Chang D. Yoo
62
27
0
28 May 2019
Semantic Fisher Scores for Task Transfer: Using Objects to Classify Scenes
Mandar Dixit
Yunsheng Li
Nuno Vasconcelos
81
14
0
27 May 2019
Structure Learning for Neural Module Networks
Vardaan Pahuja
Jie Fu
Sarath Chandar
C. Pal
69
7
0
27 May 2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning
Chenfei Wu
Yanzhao Zhou
Gen Li
Nan Duan
Duyu Tang
Xiaojie Wang
LRM
NAI
ReLM
27
2
0
24 May 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
385
1,563
0
24 May 2019
Self-Critical Reasoning for Robust Visual Question Answering
Jialin Wu
Raymond J. Mooney
OOD
NAI
77
161
0
24 May 2019
AttentionRNN: A Structured Spatial Attention Mechanism
Siddhesh Khandelwal
Leonid Sigal
67
3
0
22 May 2019
Recent Advances in Neural Question Generation
Liangming Pan
Wenqiang Lei
Tat-Seng Chua
Min-Yen Kan
75
120
0
22 May 2019
Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks
Karan Sikka
Lucas Van Bramer
Ajay Divakaran
94
2
0
17 May 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
71
82
0
15 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Yebin Liu
Yinglong Wang
Mohan Kankanhalli
57
37
0
13 May 2019
Language-Conditioned Graph Networks for Relational Reasoning
Ronghang Hu
Anna Rohrbach
Trevor Darrell
Kate Saenko
85
175
0
10 May 2019
Exact Adversarial Attack to Image Captioning via Structured Output Learning with Latent Variables
Yan Xu
Baoyuan Wu
Fumin Shen
Yanbo Fan
Yong Zhang
Heng Tao Shen
Wei Liu
AAML
78
56
0
10 May 2019
Towards Efficient Model Compression via Learned Global Ranking
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
83
172
0
28 Apr 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
171
706
0
26 Apr 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
75
230
0
25 Apr 2019
Understanding Art through Multi-Modal Retrieval in Paintings
Noa Garcia
B. Renoust
Yuta Nakashima
26
4
0
24 Apr 2019
Saliency-Guided Attention Network for Image-Sentence Matching
Zhong Ji
Haoran Wang
Jiawei Han
Yanwei Pang
69
89
0
20 Apr 2019
Challenges and Prospects in Vision and Language Research
Kushal Kafle
Robik Shrestha
Christopher Kanan
69
41
0
19 Apr 2019
Emergence of Compositional Language with Deep Generational Transmission
Michael Cogswell
Jiasen Lu
Stefan Lee
Devi Parikh
Dhruv Batra
117
49
0
19 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
179
1,257
0
18 Apr 2019
Progressive Attention Memory Network for Movie Story Question Answering
Junyeong Kim
Minuk Ma
Kyungsu Kim
Sungjin Kim
Chang D. Yoo
119
76
0
18 Apr 2019
Question Guided Modular Routing Networks for Visual Question Answering
Yanze Wu
Qiang Sun
Jianqi Ma
Bin Li
Yanwei Fu
Yao Peng
Xiangyang Xue
69
1
0
17 Apr 2019
Objects as Points
Xingyi Zhou
Dequan Wang
Philipp Krahenbuhl
3DPC
138
3,265
0
16 Apr 2019
Evaluating the Representational Hub of Language and Vision Models
Ravi Shekhar
Ece Takmaz
Raquel Fernández
Raffaella Bernardi
84
11
0
12 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
Alex Schwing
130
110
0
11 Apr 2019