Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1612.00837
Cited By
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
2 December 2016
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering"
50 / 1,966 papers shown
Title
Multimodal Explanations by Predicting Counterfactuality in Videos
Atsushi Kanehira
Kentaro Takemoto
S. Inayoshi
Tatsuya Harada
26
35
0
04 Dec 2018
Multi-task Learning of Hierarchical Vision-Language Representation
Duy-Kien Nguyen
Takayuki Okatani
25
51
0
03 Dec 2018
Learning to Caption Images through a Lifetime by Asking Questions
Tingke Shen
Amlan Kar
Sanja Fidler
22
31
0
01 Dec 2018
From Known to the Unknown: Transferring Knowledge to Answer Questions about Novel Visual and Semantic Concepts
M. Farazi
Salman H Khan
Nick Barnes
23
13
0
30 Nov 2018
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
42
383
0
29 Nov 2018
From Recognition to Cognition: Visual Commonsense Reasoning
Rowan Zellers
Yonatan Bisk
Ali Farhadi
Yejin Choi
LRM
BDL
OCL
ReLM
56
867
0
27 Nov 2018
Visual Entailment Task for Visually-Grounded Language Learning
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
20
53
0
26 Nov 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
41
12
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018
Holistic Multi-modal Memory Network for Movie Question Answering
Anran Wang
Anh Tuan Luu
Chuan-Sheng Foo
Erik Cambria
Yi Tay
V. Chandrasekhar
36
20
0
12 Nov 2018
Shifting the Baseline: Single Modality Performance on Visual Navigation & QA
Jesse Thomason
Daniel Gordon
Yonatan Bisk
31
75
0
01 Nov 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs
Alane Suhr
Stephanie Zhou
Ally Zhang
Iris Zhang
Huajun Bai
Yoav Artzi
LRM
33
589
0
01 Nov 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
19
112
0
29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human?
Arjun Chandrasekaran
Viraj Prabhu
Deshraj Yadav
Prithvijit Chattopadhyay
Devi Parikh
FAtt
89
97
0
29 Oct 2018
Neural Modular Control for Embodied Question Answering
Abhishek Das
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
135
127
0
26 Oct 2018
Understand, Compose and Respond - Answering Visual Questions by a Composition of Abstract Procedures
B. Vatashsky
S. Ullman
CoGe
26
1
0
25 Oct 2018
Knowing Where to Look? Analysis on Attention of Visual Question Answering System
Wei Li
Zehuan Yuan
Xiangzhong Fang
Changhu Wang
24
8
0
09 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization
S. Ramakrishnan
Aishwarya Agrawal
Stefan Lee
AAML
20
235
0
08 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
46
599
0
04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
31
17
0
03 Oct 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA
Shailza Jolly
Sandro Pezzelle
T. Klein
Andreas Dengel
Moin Nabi
27
2
0
12 Sep 2018
How clever is the FiLM model, and how clever can it be?
A. Kuhnle
Huiyuan Xie
Ann A. Copestake
30
6
0
09 Sep 2018
What If We Simply Swap the Two Text Fragments? A Straightforward yet Effective Way to Test the Robustness of Methods to Confounding Signals in Nature Language Inference Tasks
Haohan Wang
Da-You Sun
Eric Xing
30
42
0
07 Sep 2018
Visual Coreference Resolution in Visual Dialog using Neural Module Networks
Satwik Kottur
José M. F. Moura
Devi Parikh
Dhruv Batra
Marcus Rohrbach
24
164
0
06 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees
Qingxing Cao
Bailin Li
Xiaodan Liang
Liang Lin
33
55
0
06 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Medhini Narasimhan
A. Schwing
24
105
0
04 Sep 2018
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
Semih Yagcioglu
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
CoGe
18
171
0
04 Sep 2018
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers
Dongxiang Zhang
Lei Wang
Nuo Xu
B. Dai
Heng Tao Shen
ReLM
AIMat
45
126
0
22 Aug 2018
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
Rowan Zellers
Yonatan Bisk
Roy Schwartz
Yejin Choi
36
705
0
16 Aug 2018
How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks
Divyansh Kaushik
Zachary Chase Lipton
ELM
39
231
0
14 Aug 2018
Community Regularization of Visually-Grounded Dialog
Akshat Agarwal
Swaminathan Gurumurthy
Vasu Sharma
M. Lewis
Katia P. Sycara
18
10
0
10 Aug 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Youngjae Yu
Jongseok Kim
Gunhee Kim
29
340
0
07 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention
Mateusz Malinowski
Carl Doersch
Adam Santoro
Peter W. Battaglia
OOD
27
96
0
01 Aug 2018
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining
Yundong Zhang
Juan Carlos Niebles
Á. Soto
35
67
0
01 Aug 2018
Pythia v0.1: the Winning Entry to the VQA Challenge 2018
Yu Jiang
Vivek Natarajan
Xinlei Chen
Marcus Rohrbach
Dhruv Batra
Devi Parikh
VLM
23
202
0
26 Jul 2018
Explainable Neural Computation via Stack Neural Module Networks
Ronghang Hu
Jacob Andreas
Trevor Darrell
Kate Saenko
LRM
OCL
30
197
0
23 Jul 2018
Question Relevance in Visual Question Answering
Prakruthi Prabhakar
Nitish Kulkarni
Linghao Zhang
19
6
0
23 Jul 2018
Dynamic Multimodal Instance Segmentation guided by natural language queries
Edgar Margffoy-Tuay
Juan C. Pérez
Emilio Botero
Pablo Arbelaez
27
170
0
06 Jul 2018
Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions
Lishi Zhang
Chenghan Fu
Jia Li
30
8
0
27 Jun 2018
End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features
Chiori Hori
Huda AlAmri
Jue Wang
Gordon Wichern
Takaaki Hori
...
Raphael Gontijo-Lopes
Abhishek Das
Irfan Essa
Dhruv Batra
Devi Parikh
VGen
18
125
0
21 Jun 2018
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Will Norcliffe-Brown
Efstathios Vafeias
Sarah Parisot
GNN
21
236
0
19 Jun 2018
Learning Visual Knowledge Memory Networks for Visual Question Answering
Zhou Su
Chen Zhu
Yinpeng Dong
Dongqi Cai
Yurong Chen
Jianguo Li
34
62
0
13 Jun 2018
Cross-Dataset Adaptation for Visual Question Answering
Wei-Lun Chao
Hexiang Hu
Fei Sha
OOD
27
49
0
10 Jun 2018
Learning Answer Embeddings for Visual Question Answering
Hexiang Hu
Wei-Lun Chao
Fei Sha
18
33
0
10 Jun 2018
CS-VQA: Visual Question Answering with Compressively Sensed Images
Li-Chi Huang
K. Kulkarni
Anik Jha
Suhas Lohit
Suren Jayasuriya
Pavan Turaga
CoGe
29
8
0
08 Jun 2018
Visual Reasoning by Progressive Module Networks
Seung Wook Kim
Makarand Tapaswi
Sanja Fidler
ReLM
LRM
36
12
0
06 Jun 2018
Focal Visual-Text Attention for Visual Question Answering
Junwei Liang
Lu Jiang
Liangliang Cao
Li-Jia Li
Alexander G. Hauptmann
33
110
0
05 Jun 2018
On the Flip Side: Identifying Counterexamples in Visual Question Answering
Gabriel Grand
Aron Szanto
Yoon Kim
Alexander Rush
21
0
0
03 Jun 2018
Visual Referring Expression Recognition: What Do Systems Actually Learn?
Volkan Cirik
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
31
63
0
30 May 2018
Previous
1
2
3
...
37
38
39
40
Next