1410.0210
Cited By
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input
1 October 2014
Mateusz Malinowski
Mario Fritz
Papers citing
"A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input"
50 / 330 papers shown
Title
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
24
3
0
24 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Mohit Bansal
CoGe
19
20
0
14 Jun 2019
Figure Captioning with Reasoning and Sequence-Level Training
Charles C. Chen
Ruiyi Zhang
Eunyee Koh
Sungchul Kim
Scott D. Cohen
Tong Yu
Ryan Rossi
Razvan Bunescu
AIMat
31
38
0
07 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
24
440
0
06 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
19
1,020
0
31 May 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
41
343
0
31 May 2019
Vision-to-Language Tasks Based on Attributes and Attention Mechanism
Xuelong Li
Aihong Yuan
Xiaoqiang Lu
21
37
0
29 May 2019
Leveraging Medical Visual Question Answering with Supporting Facts
Tomasz Kornuta
Deepta Rajan
Chaitanya P. Shivade
Alexis Asseman
A. Ozcan
23
16
0
28 May 2019
Structure Learning for Neural Module Networks
Vardaan Pahuja
Jie Fu
Sarath Chandar
C. Pal
21
7
0
27 May 2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning
Chenfei Wu
Yanzhao Zhou
Gen Li
Nan Duan
Duyu Tang
Xiaojie Wang
LRM
NAI
ReLM
16
2
0
24 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Yebin Liu
Yinglong Wang
Mohan Kankanhalli
22
36
0
13 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
19
686
0
26 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
15
1,136
0
18 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
Alex Schwing
30
110
0
11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
Alex Schwing
Tamir Hazan
27
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
39
117
0
11 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
Chenyou Fan
Xiaofan Zhang
Shu Zhang
Wensheng Wang
Chi Zhang
Heng-Chiao Huang
21
276
0
08 Apr 2019
What Object Should I Use? - Task Driven Object Detection
Johann Sawatzky
Yaser Souri
C. Grund
Juergen Gall
ObjD
27
26
0
05 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
24
18
0
04 Apr 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Eric Wang
Baivab Sinha
Ying Nian Wu
18
16
0
16 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
29
66
0
01 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
25
82
0
01 Mar 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
27
137
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
272
0
25 Feb 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
56
322
0
20 Jan 2019
Memory Augmented Deep Generative models for Forecasting the Next Shot Location in Tennis
Tharindu Fernando
Simon Denman
Sridha Sridharan
Clinton Fookes
GAN
17
34
0
16 Jan 2019
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
21
4
0
03 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos
Shaojie Wang
Wentian Zhao
Ziyi Kou
Chenliang Xu
19
5
0
02 Dec 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
41
12
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering
Medhini Narasimhan
Svetlana Lazebnik
Alex Schwing
NAI
GNN
ReLM
26
11
0
01 Nov 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
19
112
0
29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human?
Arjun Chandrasekaran
Viraj Prabhu
Deshraj Yadav
Prithvijit Chattopadhyay
Devi Parikh
FAtt
92
97
0
29 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
46
599
0
04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
36
17
0
03 Oct 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA
Shailza Jolly
Sandro Pezzelle
T. Klein
Andreas Dengel
Moin Nabi
27
2
0
12 Sep 2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
M. Wagner
H. Basevi
Rakshith Shetty
Wenbin Li
Mateusz Malinowski
M. Fritz
A. Leonardis
27
29
0
11 Sep 2018
The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR
Mateusz Malinowski
Carl Doersch
ReLM
19
12
0
11 Sep 2018
Exploration on Grounded Word Embedding: Matching Words and Images with Image-Enhanced Skip-Gram Model
Ruixuan Luo
14
0
0
08 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
36
617
0
05 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Medhini Narasimhan
Alex Schwing
24
105
0
04 Sep 2018
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms
Avikalp Srivastava
Hsin Wen Liu
Sumio Fujita
28
3
0
29 Aug 2018
Multimodal Differential Network for Visual Question Generation
Badri N. Patro
Sandeep Kumar
V. Kurmi
Vinay P. Namboodiri
21
41
0
12 Aug 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Youngjae Yu
Jongseok Kim
Gunhee Kim
40
340
0
07 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention
Mateusz Malinowski
Carl Doersch
Adam Santoro
Peter W. Battaglia
OOD
27
96
0
01 Aug 2018
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining
Yundong Zhang
Juan Carlos Niebles
Á. Soto
35
67
0
01 Aug 2018
Pedestrian Trajectory Prediction with Structured Memory Hierarchies
Tharindu Fernando
Simon Denman
Sridha Sridharan
Clinton Fookes
19
18
0
22 Jul 2018
Modularity Matters: Learning Invariant Relational Reasoning Tasks
Jason Jo
Vikas Verma
Yoshua Bengio
OOD
11
8
0
18 Jun 2018
Grounded Textual Entailment
H. Vu
Claudio Greco
A. Erofeeva
Somayeh Jafaritazehjan
Guido M. Linders
Marc Tanti
A. Testoni
Raffaella Bernardi
Albert Gatt
24
29
0
14 Jun 2018