1505.00468
VQA: Visual Question Answering
3 May 2015
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
Papers citing "VQA: Visual Question Answering"
50 / 2,957 papers shown
Title
VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering
Cătălina Cangea
Eugene Belilovsky
Pietro Lio
Aaron Courville
116
17
0
14 Aug 2019
3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents
Ue-Hwan Kim
Jin-Man Park
Taek-jin Song
Jong-hwan Kim
3DV
81
108
0
14 Aug 2019
Why Does a Visual Question Have Different Answers?
Nilavra Bhattacharya
Qing Li
Danna Gurari
66
66
0
12 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
107
38
0
12 Aug 2019
Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking
Tan Wang
Xing Xu
Yang Yang
Alan Hanjalic
Heng Tao Shen
Jingkuan Song
58
150
0
12 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
Peng Gao
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Hongsheng Li
69
82
0
10 Aug 2019
VisualBERT: A Simple and Performant Baseline for Vision and Language
Liunian Harold Li
Mark Yatskar
Da Yin
Cho-Jui Hsieh
Kai-Wei Chang
VLM
255
1,975
0
09 Aug 2019
Transferable Representation Learning in Vision-and-Language Navigation
Haoshuo Huang
Vihan Jain
Harsh Mehta
Alexander Ku
Gabriel Ilharco
Jason Baldridge
Eugene Ie
LM&Ro
79
89
0
09 Aug 2019
Recognizing Part Attributes with Insufficient Data
Xiangyu Zhao
Yi Yang
Feng Zhou
Xiao Tan
Yuchen Yuan
Sid Ying-Ze Bao
Ying Nian Wu
51
20
0
09 Aug 2019
Question-Agnostic Attention for Visual Question Answering
M. Farazi
Salman H Khan
Nick Barnes
22
10
0
09 Aug 2019
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense
Difei Gao
Ruiping Wang
Shiguang Shan
Xilin Chen
CoGe
LRM
129
28
0
08 Aug 2019
ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu
Dhruv Batra
Devi Parikh
Stefan Lee
SSL
VLM
323
3,718
0
06 Aug 2019
Logic could be learned from images
Q. Guo
Y. Qian
Xinyan Liang
Yanhong She
Deyu Li
Jiye Liang
NAI
27
4
0
06 Aug 2019
Answering Questions about Data Visualizations using Efficient Bimodal Fusion
Kushal Kafle
Robik Shrestha
Brian L. Price
Scott D. Cohen
Christopher Kanan
74
60
0
05 Aug 2019
Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models
Daniel Omeiza
Skyler Speakman
C. Cintas
Komminist Weldemariam
FAtt
72
219
0
03 Aug 2019
An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation
Vincent Michalski
Vikram S. Voleti
Samira Ebrahimi Kahou
Anthony Ortiz
Pascal Vincent
C. Pal
Doina Precup
BDL
52
6
0
31 Jul 2019
Learning Question-Guided Video Representation for Multi-Turn Video Question Answering
Guan-Lin Chao
Abhinav Rastogi
Semih Yavuz
Dilek Z. Hakkani-Tür
Jindong Chen
Ian Lane
51
6
0
31 Jul 2019
LEAF-QA: Locate, Encode & Attend for Figure Question Answering
Ritwick Chaudhry
Sumit Shekhar
Utkarsh Gupta
Pranav Maneriker
Prann Bansal
Ajay Joshi
LMTD
55
89
0
30 Jul 2019
Modulation of early visual processing alleviates capacity limits in solving multiple tasks
Sushrut Thorat
G. Aldegheri
Marcel van Gerven
M. Peelen
MoE
28
3
0
29 Jul 2019
V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices
Damien Teney
Peng Wang
Jiewei Cao
Lingqiao Liu
Chunhua Shen
Anton Van Den Hengel
65
31
0
29 Jul 2019
An Empirical Study on Leveraging Scene Graphs for Visual Question Answering
Cheng Zhang
Wei-Lun Chao
D. Xuan
77
51
0
28 Jul 2019
Learning Goal-Oriented Visual Dialog Agents: Imitating and Surpassing Analytic Experts
Yenchih Chang
Wen-Hsiao Peng
30
4
0
24 Jul 2019
Bilinear Graph Networks for Visual Question Answering
Dalu Guo
Chang Xu
Dacheng Tao
GNN
86
54
0
23 Jul 2019
Position Focused Attention Network for Image-Text Matching
Yaxiong Wang
Hao-Hsiang Yang
Xueming Qian
Lin Ma
Jing Lu
Biao Li
Xin Fan
54
172
0
23 Jul 2019
Highlight Every Step: Knowledge Distillation via Collaborative Teaching
Haoran Zhao
Changrui Chen
Junyu Dong
Xin Sun
Zihe Dong
80
59
0
23 Jul 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
141
136
0
22 Jul 2019
Why Build an Assistant in Minecraft?
Arthur Szlam
Jonathan Gray
Kavya Srinet
Yacine Jernite
Armand Joulin
...
Siddharth Goyal
Demi Guo
Dan Rothermel
C. L. Zitnick
Jason Weston
LLMAG
150
29
0
22 Jul 2019
CraftAssist: A Framework for Dialogue-enabled Interactive Agents
Jonathan Gray
Kavya Srinet
Yacine Jernite
Haonan Yu
Zhuoyuan Chen
Demi Guo
Siddharth Goyal
C. L. Zitnick
Arthur Szlam
78
39
0
19 Jul 2019
A Survey on Explainable Artificial Intelligence (XAI): Towards Medical XAI
Erico Tjoa
Cuntai Guan
XAI
170
1,467
0
17 Jul 2019
2nd Place Solution to the GQA Challenge 2019
Shijie Geng
Ji Zhang
Hang Zhang
Ahmed Elgammal
Dimitris N. Metaxas
ReLM
32
5
0
16 Jul 2019
Concept-Centric Visual Turing Tests for Method Validation
Tatiana Fountoukidou
Raphael Sznitman
52
2
0
15 Jul 2019
Composing Neural Learning and Symbolic Reasoning with an Application to Visual Discrimination
Adithya Murali
Atharva Sehgal
Paul Krogmeier
P. Madhusudan
21
4
0
12 Jul 2019
Vision-and-Dialog Navigation
Jesse Thomason
Michael Murray
Maya Cakmak
Luke Zettlemoyer
LM&Ro
102
331
0
10 Jul 2019
Learning by Abstraction: The Neural State Machine
Drew A. Hudson
Christopher D. Manning
NAI
OCL
131
262
0
09 Jul 2019
Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions
Yulei Niu
Hanwang Zhang
Zhiwu Lu
Shih-Fu Chang
ObjD
BDL
101
26
0
08 Jul 2019
Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi
Lorenzo Baraldi
M. Corsini
Rita Cucchiara
LM&Ro
98
27
0
05 Jul 2019
Kite: Automatic speech recognition for unmanned aerial vehicles
Dan Oneaţă
H. Cucu
40
13
0
02 Jul 2019
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
Hung Le
Doyen Sahoo
Nancy F. Chen
Guosheng Lin
63
112
0
02 Jul 2019
Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization
Paul Pu Liang
Zhun Liu
Yao-Hung Hubert Tsai
Qibin Zhao
Ruslan Salakhutdinov
Louis-Philippe Morency
AI4TS
90
84
0
01 Jul 2019
The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention
Yusuke Yamaura
Nobuya Kanemaki
Y. Tsuboshita
66
3
0
01 Jul 2019
ICDAR 2019 Competition on Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Minesh Mathew
C. V. Jawahar
Ernest Valveny
Dimosthenis Karatzas
74
76
0
30 Jun 2019
Open-Ended Long-Form Video Question Answering via Hierarchical Convolutional Self-Attention Networks
Zhu Zhang
Zhou Zhao
Zhijie Lin
Jingkuan Song
Xiaofei He
BDL
49
14
0
28 Jun 2019
Deep Modular Co-Attention Networks for Visual Question Answering
Zhou Yu
Jun Yu
Yuhao Cui
Dacheng Tao
Q. Tian
118
811
0
25 Jun 2019
RUBi: Reducing Unimodal Biases in Visual Question Answering
Rémi Cadène
Corentin Dancette
H. Ben-younes
Matthieu Cord
Devi Parikh
CML
104
374
0
24 Jun 2019
Investigating Biases in Textual Entailment Datasets
Shawn Tan
Songlin Yang
Chin-Wei Huang
Aaron Courville
61
8
0
23 Jun 2019
Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments
K. Niu
Y. Huang
Wanli Ouyang
Liang Wang
58
144
0
23 Jun 2019
Adversarial Regularization for Visual Question Answering: Strengths, Shortcomings, and Side Effects
Gabriel Grand
Yonatan Belinkov
111
68
0
20 Jun 2019
Expressing Visual Relationships via Language
Hao Tan
Franck Dernoncourt
Zhe Lin
Trung Bui
Joey Tianyi Zhou
90
68
0
18 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Joey Tianyi Zhou
CoGe
50
20
0
14 Jun 2019
The Replica Dataset: A Digital Replica of Indoor Spaces
Julian Straub
Thomas Whelan
Lingni Ma
Yufan Chen
Erik Wijmans
...
H. Strasdat
R. D. Nardi
Michael Goesele
S. Lovegrove
Richard Newcombe
3DV
138
865
0
13 Jun 2019