ResearchTrend.AI

A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

1 October 2014
Mateusz Malinowski
Mario Fritz

Papers citing "A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input"

50 / 330 papers shown
Title
Adversarial Multimodal Network for Movie Question Answering
Zhaoquan Yuan
Siyuan Sun
Lixin Duan
Xiao Wu
Changsheng Xu
24
3
0
24 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions
Hyounghun Kim
Joey Tianyi Zhou
CoGe
19
20
0
14 Jun 2019
Figure Captioning with Reasoning and Sequence-Level Training
Charles C. Chen
Ruiyi Zhang
Eunyee Koh
Sungchul Kim
Scott D. Cohen
Tong Yu
Ryan Rossi
Razvan Bunescu
AIMat
31
38
0
07 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Zhou Yu
D. Xu
Jun-chen Yu
Ting Yu
Zhou Zhao
Yueting Zhuang
Dacheng Tao
24
440
0
06 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Kenneth Marino
Mohammad Rastegari
Ali Farhadi
Roozbeh Mottaghi
19
1,020
0
31 May 2019
Scene Text Visual Question Answering
Ali Furkan Biten
Rubèn Pérez Tito
Andrés Mafla
Lluís Gómez
Marçal Rusiñol
Ernest Valveny
C. V. Jawahar
Dimosthenis Karatzas
41
343
0
31 May 2019
Vision-to-Language Tasks Based on Attributes and Attention Mechanism
Xuelong Li
Aihong Yuan
Xiaoqiang Lu
21
37
0
29 May 2019
Leveraging Medical Visual Question Answering with Supporting Facts
Tomasz Kornuta
Deepta Rajan
Chaitanya P. Shivade
Alexis Asseman
A. Ozcan
23
16
0
28 May 2019
Structure Learning for Neural Module Networks
Vardaan Pahuja
Jie Fu
Sarath Chandar
C. Pal
21
7
0
27 May 2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning
Chenfei Wu
Yanzhao Zhou
Gen Li
Nan Duan
Duyu Tang
Xiaojie Wang
LRM
NAI
ReLM
16
2
0
24 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering
Yangyang Guo
Zhiyong Cheng
Liqiang Nie
Yebin Liu
Yinglong Wang
Mohan Kankanhalli
22
36
0
13 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
Jiayuan Mao
Chuang Gan
Pushmeet Kohli
J. Tenenbaum
Jiajun Wu
NAI
19
686
0
26 Apr 2019
Towards VQA Models That Can Read
Amanpreet Singh
Vivek Natarajan
Meet Shah
Yu Jiang
Xinlei Chen
Dhruv Batra
Devi Parikh
Marcus Rohrbach
EgoV
15
1,136
0
18 Apr 2019
Factor Graph Attention
Idan Schwartz
Seunghak Yu
Tamir Hazan
Alex Schwing
30
110
0
11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog
Idan Schwartz
Alex Schwing
Tamir Hazan
27
69
0
11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations
Zilong Zheng
Wenguan Wang
Siyuan Qi
Song-Chun Zhu
39
117
0
11 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
Chenyou Fan
Xiaofan Zhang
Shu Zhang
Wensheng Wang
Chi Zhang
Heng-Chiao Huang
21
276
0
08 Apr 2019
What Object Should I Use? - Task Driven Object Detection
Johann Sawatzky
Yaser Souri
C. Grund
Juergen Gall
ObjD
27
26
0
05 Apr 2019
VQD: Visual Query Detection in Natural Scenes
Manoj Acharya
Karan Jariwala
Christopher Kanan
ObjD
24
18
0
04 Apr 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning
Peixi Xiong
Huayi Zhan
Xin Eric Wang
Baivab Sinha
Ying Nian Wu
18
16
0
16 Mar 2019
Learning To Follow Directions in Street View
Karl Moritz Hermann
Mateusz Malinowski
Piotr Wojciech Mirowski
Andras Banki-Horvath
Keith Anderson
R. Hadsell
SSL
29
66
0
01 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models
Robik Shrestha
Kushal Kafle
Christopher Kanan
25
82
0
01 Mar 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering
Drew A. Hudson
Christopher D. Manning
CoGe
NAI
27
137
0
25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Rémi Cadène
H. Ben-younes
Matthieu Cord
Nicolas Thome
LRM
19
272
0
25 Feb 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
56
322
0
20 Jan 2019
Memory Augmented Deep Generative models for Forecasting the Next Shot Location in Tennis
Tharindu Fernando
Simon Denman
Sridha Sridharan
Clinton Fookes
GAN
17
34
0
16 Jan 2019
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
21
4
0
03 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos
Shaojie Wang
Wentian Zhao
Ziyi Kou
Chenliang Xu
19
5
0
02 Dec 2018
VQA with no questions-answers training
B. Vatashsky
S. Ullman
41
12
0
20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models
Varun Manjunatha
Nirat Saini
L. Davis
CML
FAtt
19
92
0
19 Nov 2018
On transfer learning using a MAC model variant
Vincent Marois
T. S. Jayram
V. Albouy
Tomasz Kornuta
Younes Bouhadjar
A. Ozcan
DRL
26
9
0
15 Nov 2018
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering
Medhini Narasimhan
Svetlana Lazebnik
Alex Schwing
NAI
GNN
ReLM
26
11
0
01 Nov 2018
TallyQA: Answering Complex Counting Questions
Manoj Acharya
Kushal Kafle
Christopher Kanan
19
112
0
29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human?
Arjun Chandrasekaran
Viraj Prabhu
Deshraj Yadav
Prithvijit Chattopadhyay
Devi Parikh
FAtt
92
97
0
29 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
Kexin Yi
Jiajun Wu
Chuang Gan
Antonio Torralba
Pushmeet Kohli
J. Tenenbaum
NAI
46
599
0
04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering
Hyeonwoo Noh
Taehoon Kim
Jonghwan Mun
Bohyung Han
36
17
0
03 Oct 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA
Shailza Jolly
Sandro Pezzelle
T. Klein
Andreas Dengel
Moin Nabi
27
2
0
12 Sep 2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions
M. Wagner
H. Basevi
Rakshith Shetty
Wenbin Li
Mateusz Malinowski
M. Fritz
A. Leonardis
27
29
0
11 Sep 2018
The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR
Mateusz Malinowski
Carl Doersch
ReLM
19
12
0
11 Sep 2018
Exploration on Grounded Word Embedding: Matching Words and Images with Image-Enhanced Skip-Gram Model
Ruixuan Luo
14
0
0
08 Sep 2018
TVQA: Localized, Compositional Video Question Answering
Jie Lei
Licheng Yu
Mohit Bansal
Tamara L. Berg
36
617
0
05 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering
Medhini Narasimhan
Alex Schwing
24
105
0
04 Sep 2018
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms
Avikalp Srivastava
Hsin Wen Liu
Sumio Fujita
28
3
0
29 Aug 2018
Multimodal Differential Network for Visual Question Generation
Badri N. Patro
Sandeep Kumar
V. Kurmi
Vinay P. Namboodiri
21
41
0
12 Aug 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Youngjae Yu
Jongseok Kim
Gunhee Kim
40
340
0
07 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention
Mateusz Malinowski
Carl Doersch
Adam Santoro
Peter W. Battaglia
OOD
27
96
0
01 Aug 2018
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining
Yundong Zhang
Juan Carlos Niebles
Á. Soto
35
67
0
01 Aug 2018
Pedestrian Trajectory Prediction with Structured Memory Hierarchies
Tharindu Fernando
Simon Denman
Sridha Sridharan
Clinton Fookes
19
18
0
22 Jul 2018
Modularity Matters: Learning Invariant Relational Reasoning Tasks
Jason Jo
Vikas Verma
Yoshua Bengio
OOD
11
8
0
18 Jun 2018
Grounded Textual Entailment
H. Vu
Claudio Greco
A. Erofeeva
Somayeh Jafaritazehjan
Guido M. Linders
Marc Tanti
A. Testoni
Raffaella Bernardi
Albert Gatt
24
29
0
14 Jun 2018