1505.00468
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Papers citing "VQA: Visual Question Answering"
50 / 2,957 papers shown
Sitatapatra: Blocking the Transfer of Adversarial Samples
Ilia Shumailov
Xitong Gao
Yiren Zhao
Robert D. Mullins
Ross J. Anderson
Chengzhong Xu
AAML
GAN
64
14
0
23 Jan 2019
Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey
W. Zhang
Quan Z. Sheng
A. Alhazmi
Chenliang Li
AAML
125
57
0
21 Jan 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
127
327
0
20 Jan 2019
Evaluating Text-to-Image Matching using Binary Image Selection (BISON)
Hexiang Hu
Ishan Misra
Laurens van der Maaten
89
22
0
19 Jan 2019
Toward Explainable Fashion Recommendation
Pongsate Tangseng
Takayuki Okatani
59
29
0
15 Jan 2019
Dialog System Technology Challenge 7
Koichiro Yoshino
Chiori Hori
Julien Perez
L. F. D’Haro
L. Polymenakos
...
Xiang Gao
Huda AlAmri
Tim K. Marks
Devi Parikh
Dhruv Batra
85
37
0
11 Jan 2019
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
Chih-Yao Ma
Jiasen Lu
Zuxuan Wu
G. Al-Regib
Z. Kira
R. Socher
Caiming Xiong
LM&Ro
100
279
0
10 Jan 2019
JECL: Joint Embedding and Cluster Learning for Image-Text Pairs
Sean T. Yang
Kuan-Hao Huang
Bill Howe
VLM
26
3
0
04 Jan 2019
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions
Runtao Liu
Chenxi Liu
Yutong Bai
Alan Yuille
NAI
ObjD
135
123
0
03 Jan 2019
Action2Vec: A Crossmodal Embedding Approach to Action Learning
Meera Hahn
Andrew Silva
James M. Rehg
80
58
0
02 Jan 2019
The meaning of "most" for visual question answering models
A. Kuhnle
Ann A. Copestake
38
4
0
31 Dec 2018
Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering
Zhuoqian Yang
Zengchang Qin
Jing Yu
Yue Hu
GNN
80
16
0
23 Dec 2018
Context, Attention and Audio Feature Explorations for Audio Visual Scene-Aware Dialog
Shachi H. Kumar
Eda Okur
Saurav Sahay
Juan Jose Alvarado Leanos
Jonathan Huang
L. Nachman
31
10
0
20 Dec 2018
Sequential Attention GAN for Interactive Image Editing
Yu Cheng
Zhe Gan
Yitong Li
Jingjing Liu
Jianfeng Gao
88
98
0
20 Dec 2018
Composing Text and Image for Image Retrieval - An Empirical Odyssey
Nam S. Vo
Lu Jiang
Chen Sun
Kevin Patrick Murphy
Li Li
Li Fei-Fei
James Hays
CoGe
76
370
0
18 Dec 2018
From FiLM to Video: Multi-turn Question Answering with Multi-modal Context
T. Nguyen
Shikhar Sharma
Hannes Schulz
Layla El Asri
69
33
0
17 Dec 2018
Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering
Peng Gao
Zhengkai Jiang
Haoxuan You
Pan Lu
Steven C. H. Hoi
Xiaogang Wang
Hongsheng Li
AIMat
106
368
0
13 Dec 2018
Learning Representations of Sets through Optimized Permutations
Yan Zhang
Jonathon S. Hare
Adam Prugel-Bennett
SSL
81
25
0
10 Dec 2018
Spatial Knowledge Distillation to aid Visual Reasoning
Somak Aditya
Rudra Saha
Yezhou Yang
Chitta Baral
72
15
0
10 Dec 2018
Semantically-Aware Attentive Neural Embeddings for Image-based Visual Localization
Zachary Seymour
Karan Sikka
Han-Pang Chiu
S. Samarasekera
Rakesh Kumar
63
10
0
08 Dec 2018
Recursive Visual Attention in Visual Dialog
Yulei Niu
Hanwang Zhang
Manli Zhang
Jianhong Zhang
Zhiwu Lu
Ji-Rong Wen
109
119
0
06 Dec 2018
Learning to Compose Dynamic Tree Structures for Visual Contexts
Kaihua Tang
Hanwang Zhang
Baoyuan Wu
Wenhan Luo
Wen Liu
108
505
0
05 Dec 2018
Explainable and Explicit Visual Reasoning over Scene Graphs
Jiaxin Shi
Hanwang Zhang
Juan-Zi Li
OCL
212
235
0
05 Dec 2018
A System for Automated Image Editing from Natural Language Commands
Jacqueline Brixey
R. Manuvinakurike
Nham Le
T. Lai
W. Chang
Trung Bui
32
4
0
03 Dec 2018
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
67
4
0
03 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
110
52
0
03 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos
Shaojie Wang
Wentian Zhao
Ziyi Kou
Chenliang Xu
43
5
0
02 Dec 2018
Learning to Caption Images through a Lifetime by Asking Questions
Tingke Shen
Amlan Kar
Sanja Fidler
100
31
0
01 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts
M. Farazi
Salman H Khan
Nick Barnes
58
13
0
30 Nov 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
124
391
0
29 Nov 2018
Visual Question Answering as Reading Comprehension
Hui Li
Peng Wang
Chunhua Shen
Anton Van Den Hengel
62
41
0
29 Nov 2018
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding
Hassan Akbari
Svebor Karaman
Surabhi Bhargava
Brian Chen
Carl Vondrick
Shih-Fu Chang
66
83
0
28 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
231
885
0
27 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
60
53
0
26 Nov 2018
CLEAR: A Dataset for Compositional Language and Elementary Acoustic Reasoning
Jerome Abdelnour
G. Salvi
Jean Rouat
45
14
0
26 Nov 2018
A Survey of Mobile Computing for the Visually Impaired
Martin Weiss
Margaux Luck
Roger Girgis
C. Pal
Joseph Paul Cohen
49
10
0
25 Nov 2018
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
Xin Eric Wang
Qiuyuan Huang
Asli Celikyilmaz
Jianfeng Gao
Dinghan Shen
Yuan-fang Wang
William Yang Wang
Lei Zhang
LM&Ro
SSL
147
542
0
25 Nov 2018
An Interpretable Model for Scene Graph Generation
Ji Zhang
Kevin J. Shih
Andrew Tao
Bryan Catanzaro
Ahmed Elgammal
GNN
66
22
0
21 Nov 2018
Early Fusion for Goal Directed Robotic Vision
Aaron Walsman
Yonatan Bisk
Saadia Gabriel
Dipendra Kumar Misra
Yoav Artzi
Yejin Choi
Dieter Fox
79
9
0
21 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
108
13
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
69
93
0
19 Nov 2018
RePr: Improved Training of Convolutional Filters
Aaditya (Adi) Prakash
J. Storer
D. Florêncio
Cha Zhang
VLM
CVBM
93
57
0
18 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
85
9
0
15 Nov 2018
Blindfold Baselines for Embodied QA
Ankesh Anand
Eugene Belilovsky
Kyle Kastner
Hugo Larochelle
Aaron Courville
104
45
0
12 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
116
20
0
12 Nov 2018
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
Vasili Ramanishka
Yi-Ting Chen
Teruhisa Misu
Kate Saenko
103
286
0
06 Nov 2018
Semantic bottleneck for computer vision tasks
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
92
17
0
06 Nov 2018
Image Chat: Engaging Grounded Conversations
Kurt Shuster
Samuel Humeau
Antoine Bordes
Jason Weston
121
119
0
02 Nov 2018
Zero-Shot Transfer VQA Dataset
Yuanpeng Li
Yi Yang
Jianyu Wang
Wei Xu
51
9
0
02 Nov 2018
Shifting the Baseline: Single Modality Performance on Visual Navigation & QA
Jesse Thomason
Daniel Gordon
Yonatan Bisk
113
75
0
01 Nov 2018