Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1505.00468
Cited By
v1
v2
v3
v4
v5
v6
v7 (latest)
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"VQA: Visual Question Answering"
50 / 2,957 papers shown
Title
Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering
V. Kazemi
Ali Elqursh
OOD
89
185
0
11 Apr 2017
Pay Attention to Those Sets! Learning Quantification from Images
Ionut-Teodor Sorodoc
Sandro Pezzelle
Aurélie Herbelot
Mariella Dimiccoli
Raffaella Bernardi
37
0
0
10 Apr 2017
An Empirical Evaluation of Visual Question Answering for Novel Objects
Santhosh Kumar Ramakrishnan
Ambar Pal
Gaurav Sharma
Anurag Mittal
OOD
98
32
0
08 Apr 2017
It Takes Two to Tango: Towards Theory of AI's Mind
Arjun Chandrasekaran
Deshraj Yadav
Prithvijit Chattopadhyay
Viraj Prabhu
Devi Parikh
115
55
0
03 Apr 2017
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
Tanmay Gupta
Kevin J. Shih
Saurabh Singh
Derek Hoiem
110
26
0
02 Apr 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems
Amrita Saha
Mitesh Khapra
Karthik Sankaranarayanan
86
8
0
01 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
Albert Gatt
E. Krahmer
LM&MA
ELM
153
828
0
29 Mar 2017
A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment
Haonan Yu
Haichao Zhang
Wenyuan Xu
LM&Ro
82
25
0
28 Mar 2017
An Analysis of Visual Question Answering Algorithms
Kushal Kafle
Christopher Kanan
87
234
0
28 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation
Chenxi Liu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Alan Yuille
EgoV
94
241
0
23 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning
Abhishek Das
Satwik Kottur
J. M. F. Moura
Stefan Lee
Dhruv Batra
OffRL
157
425
0
20 Mar 2017
VQABQ: Visual Question Answering by Basic Questions
Jia-Hong Huang
Modar Alfadly
Guohao Li
58
25
0
19 Mar 2017
Recurrent Models for Situation Recognition
Arun Mallya
Svetlana Lazebnik
80
30
0
18 Mar 2017
End-to-end optimization of goal-driven and visually grounded dialogue systems
Florian Strub
H. D. Vries
Jérémie Mary
Bilal Piot
Aaron Courville
Olivier Pietquin
OffRL
83
138
0
15 Mar 2017
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
Xiaodan Liang
Lisa Lee
Eric Xing
88
252
0
08 Mar 2017
Asymmetric Tri-training for Unsupervised Domain Adaptation
Kuniaki Saito
Yoshitaka Ushiku
Tatsuya Harada
149
587
0
27 Feb 2017
Visual Translation Embedding Network for Visual Relation Detection
Hanwang Zhang
Zawlin Kyaw
Shih-Fu Chang
Tat-Seng Chua
ViT
249
563
0
27 Feb 2017
ViP-CNN: Visual Phrase Guided Convolutional Neural Network
Yikang Li
Wanli Ouyang
Xiaogang Wang
Xiaoóu Tang
ObjD
69
48
0
23 Feb 2017
Task-driven Visual Saliency and Attention-based Visual Question Answering
Yuetan Lin
Zhangyang Pang
Donghui Wang
Yueting Zhuang
61
26
0
22 Feb 2017
Person Search with Natural Language Description
Shuang Li
Tong Xiao
Hongsheng Li
Bolei Zhou
Dayu Yue
Xiaogang Wang
105
396
0
19 Feb 2017
Gated Multimodal Units for Information Fusion
John Arevalo
Thamar Solorio
Manuel Montes-y-Gómez
Fabio Gonzalez
106
382
0
07 Feb 2017
Living a discrete life in a continuous world: Reference with distributed representations
Gemma Boleda
Sebastian Padó
N. Pham
Marco Baroni
20
0
0
06 Feb 2017
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
N. Mostafazadeh
Chris Brockett
W. Dolan
Michel Galley
Jianfeng Gao
Georgios P. Spithourakis
Lucy Vanderwende
102
183
0
28 Jan 2017
Context-aware Captions from Context-agnostic Supervision
Ramakrishna Vedantam
Samy Bengio
Kevin Patrick Murphy
Devi Parikh
Gal Chechik
96
152
0
11 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
Licheng Yu
Hao Tan
Joey Tianyi Zhou
Tamara L. Berg
ObjD
98
275
0
30 Dec 2016
Learning Visual N-Grams from Web Data
Ang Li
Allan Jabri
Armand Joulin
Laurens van der Maaten
VLM
85
138
0
29 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
80
9
0
22 Dec 2016
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
357
2,394
0
20 Dec 2016
Automatic Generation of Grounded Visual Questions
Shijie Zhang
Zhuang Li
Shaodi You
Zhenglu Yang
Jiawan Zhang
OOD
79
79
0
20 Dec 2016
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
OOD
90
86
0
16 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence
Dong Huk Park
Lisa Anne Hendricks
Zeynep Akata
Bernt Schiele
Trevor Darrell
Marcus Rohrbach
AAML
85
79
0
14 Dec 2016
Learning to Hash-tag Videos with Tag2Vec
A. Singh
Saurabh Saini
R. Shah
P. J. Narayanan
32
1
0
13 Dec 2016
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering
Marc Bolaños
Álvaro Peris
F. Casacuberta
Petia Radeva
68
6
0
12 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos
Jonghwan Mun
Paul Hongsuck Seo
Ilchae Jung
Bohyung Han
97
110
0
06 Dec 2016
ImageNet pre-trained models with batch normalization
Marcel Simon
E. Rodner
Joachim Denzler
VLM
SSeg
104
166
0
05 Dec 2016
Deep Multi-Modal Image Correspondence Learning
Chen Liu
Jiajun Wu
Pushmeet Kohli
Yasutaka Furukawa
28
5
0
05 Dec 2016
Who is Mistaken?
Benjamin Eysenbach
Carl Vondrick
Antonio Torralba
105
17
0
04 Dec 2016
Commonly Uncommon: Semantic Sparsity in Situation Recognition
Mark Yatskar
Vicente Ordonez
Luke Zettlemoyer
Ali Farhadi
VLM
73
42
0
03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
397
3,275
0
02 Dec 2016
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
163
1,005
0
26 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
110
428
0
23 Nov 2016
A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering
Tegan Maharaj
Nicolas Ballas
Anna Rohrbach
Aaron Courville
C. Pal
VGen
80
108
0
23 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li Li
VLM
103
170
0
21 Nov 2016
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
Bryan A. Plummer
Arun Mallya
Christopher M. Cervantes
Julia Hockenmaier
Svetlana Lazebnik
139
189
0
21 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
62
10
0
20 Nov 2016
Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic
Somak Aditya
Yezhou Yang
Chitta Baral
Yiannis Aloimonos
ReLM
21
4
0
17 Nov 2016
Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance
Marco Tulio Ribeiro
Sameer Singh
Carlos Guestrin
FAtt
68
64
0
17 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
84
1,666
0
17 Nov 2016
Zero-Shot Visual Question Answering
Damien Teney
Anton Van Den Hengel
90
74
0
17 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives
Mohit Iyyer
Varun Manjunatha
Anupam Guha
Yogarshi Vyas
Jordan L. Boyd-Graber
Hal Daumé
L. Davis
85
100
0
16 Nov 2016
Previous
1
2
3
...
56
57
58
59
60
Next