ResearchTrend.AI
A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

1 October 2014
Mateusz Malinowski
Mario Fritz

Papers citing "A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input"

50 / 330 papers shown
Title
VQABQ: Visual Question Answering by Basic Questions
Jia-Hong Huang
Modar Alfadly
Guohao Li
27
24
0
19 Mar 2017
Tree Memory Networks for Modelling Long-term Temporal Dependencies
Tharindu Fernando
Simon Denman
A. Mcfadyen
Sridha Sridharan
Clinton Fookes
26
53
0
12 Mar 2017
Task-driven Visual Saliency and Attention-based Visual Question Answering
Yuetan Lin
Zhangyang Pang
Donghui Wang
Yueting Zhuang
35
26
0
22 Feb 2017
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation
N. Mostafazadeh
Chris Brockett
W. Dolan
Michel Galley
Jianfeng Gao
Georgios P. Spithourakis
Lucy Vanderwende
26
181
0
28 Jan 2017
Context-aware Captions from Context-agnostic Supervision
Ramakrishna Vedantam
Samy Bengio
Kevin Patrick Murphy
Devi Parikh
Gal Chechik
22
152
0
11 Jan 2017
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task
Nan Ding
Sebastian Goodman
Fei Sha
Radu Soricut
VLM
27
9
0
22 Dec 2016
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning
Justin Johnson
B. Hariharan
Laurens van der Maaten
Li Fei-Fei
C. L. Zitnick
Ross B. Girshick
CoGe
107
2,324
0
20 Dec 2016
Automatic Generation of Grounded Visual Questions
Shijie Zhang
Lizhen Qu
Shaodi You
Zhenglu Yang
Jiawan Zhang
OOD
27
79
0
20 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos
Jonghwan Mun
Paul Hongsuck Seo
Ilchae Jung
Bohyung Han
50
108
0
06 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal
Tejas Khot
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
158
3,136
0
02 Dec 2016
Visual Dialog
Abhishek Das
Satwik Kottur
Khushi Gupta
Avi Singh
Deshraj Yadav
José M. F. Moura
Devi Parikh
Dhruv Batra
71
990
0
26 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
50
427
0
23 Nov 2016
A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering
Tegan Maharaj
Nicolas Ballas
Anna Rohrbach
Aaron Courville
C. Pal
VGen
15
107
0
23 Nov 2016
Zero-Shot Visual Question Answering
Damien Teney
Anton Van Den Hengel
29
73
0
17 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering
Kuo-Hao Zeng
Tseng-Hung Chen
Ching-Yao Chuang
Yuan-Hong Liao
Juan Carlos Niebles
Min Sun
32
175
0
12 Nov 2016
Crowdsourcing in Computer Vision
Adriana Kovashka
Olga Russakovsky
Li Fei-Fei
Kristen Grauman
HAI
VLM
3DV
49
149
0
07 Nov 2016
Proposing Plausible Answers for Open-ended Visual Question Answering
Omid Bakhshandeh
Trung Bui
Zhe Lin
W. Chang
29
1
0
20 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
33
235
0
05 Oct 2016
Tutorial on Answering Questions about Images with Deep Learning
Mateusz Malinowski
Mario Fritz
VLM
37
3
0
04 Oct 2016
Graph-Structured Representations for Visual Question Answering
Damien Teney
Lingqiao Liu
Anton Van Den Hengel
GNN
NAI
40
416
0
19 Sep 2016
Measuring Machine Intelligence Through Visual Question Answering
C. L. Zitnick
Aishwarya Agrawal
Stanislaw Antol
Margaret Mitchell
Dhruv Batra
Devi Parikh
27
37
0
31 Aug 2016
Visual Question: Predicting If a Crowd Will Agree on the Answer
Danna Gurari
Kristen Grauman
HAI
29
2
0
29 Aug 2016
Solving Visual Madlibs with Multiple Cues
Tatiana Tommasi
Arun Mallya
Bryan A. Plummer
Svetlana Lazebnik
Alexander C. Berg
Tamara L. Berg
37
18
0
11 Aug 2016
Mean Box Pooling: A Rich Image Representation and Output Embedding for the Visual Madlibs Task
Ashkan Mokarian
Mateusz Malinowski
Mario Fritz
27
5
0
09 Aug 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
41
413
0
20 Jul 2016
"Show me the cup": Reference with Continuous Representations
Gemma Boleda
Sebastian Padó
Marco Baroni
26
3
0
28 Jun 2016
Analyzing the Behavior of Visual Question Answering Models
Aishwarya Agrawal
Dhruv Batra
Devi Parikh
41
309
0
23 Jun 2016
Semantic Parsing to Probabilistic Programs for Situated Question Answering
Jayant Krishnamurthy
Oyvind Tafjord
Aniruddha Kembhavi
34
24
0
22 Jun 2016
Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions
Arijit Ray
Gordon A. Christie
Joey Tianyi Zhou
Dhruv Batra
Devi Parikh
27
56
0
21 Jun 2016
FVQA: Fact-based Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton van den Hengel
A. Dick
CoGe
39
455
0
17 Jun 2016
Training Recurrent Answering Units with Joint Loss Minimization for VQA
Hyeonwoo Noh
Bohyung Han
32
71
0
12 Jun 2016
Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions?
Abhishek Das
Harsh Agrawal
C. L. Zitnick
Devi Parikh
Dhruv Batra
32
465
0
11 Jun 2016
Ask Your Neurons: A Deep Learning Approach to Visual Question Answering
Mateusz Malinowski
Marcus Rohrbach
Mario Fritz
24
101
0
09 May 2016
Leveraging Visual Question Answering for Image-Caption Ranking
Xiaoyu Lin
Devi Parikh
CoGe
22
83
0
04 May 2016
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering
Arun Mallya
Svetlana Lazebnik
39
119
0
16 Apr 2016
Visual Storytelling
Ting-Hao 'Kenneth' Huang
Francis Ferraro
N. Mostafazadeh
Ishan Misra
...
C. L. Zitnick
Devi Parikh
Lucy Vanderwende
Michel Galley
Margaret Mitchell
VGen
22
464
0
13 Apr 2016
Attributes as Semantic Units between Natural Language and Visual Recognition
Marcus Rohrbach
VLM
22
3
0
12 Apr 2016
Resolving Language and Vision Ambiguities Together: Joint Segmentation & Prepositional Attachment Resolution in Captioned Scenes
Gordon A. Christie
A. Laddha
Aishwarya Agrawal
Stanislaw Antol
Yash Goyal
K. Kochersberger
Dhruv Batra
28
30
0
07 Apr 2016
A Focused Dynamic Attention Model for Visual Question Answering
Ilija Ilievski
Shuicheng Yan
Jiashi Feng
28
122
0
06 Apr 2016
A Diagram Is Worth A Dozen Images
Aniruddha Kembhavi
M. Salvato
Eric Kolve
Minjoon Seo
Hannaneh Hajishirzi
Ali Farhadi
3DV
12
439
0
24 Mar 2016
BreakingNews: Article Annotation by Image and Text Processing
Arnau Ramisa
F. Yan
Francesc Moreno-Noguer
K. Mikolajczyk
29
105
0
23 Mar 2016
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge
Qi Wu
Chunhua Shen
Anton Van Den Hengel
Peng Wang
A. Dick
27
360
0
09 Mar 2016
Dynamic Memory Networks for Visual and Textual Question Answering
Caiming Xiong
Stephen Merity
R. Socher
34
753
0
04 Mar 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna
Yuke Zhu
Oliver Groth
Justin Johnson
Kenji Hata
...
Yannis Kalantidis
Li Li
David A. Shamma
Michael S. Bernstein
Fei-Fei Li
117
5,663
0
23 Feb 2016
Contextual Media Retrieval Using Natural Language Queries
Sreyasi Nag Chowdhury
Mateusz Malinowski
Andreas Bulling
Mario Fritz
20
2
0
16 Feb 2016
Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures
Raffaella Bernardi
Ruken Cakici
Desmond Elliott
Aykut Erdem
Erkut Erdem
Nazli Ikizler-Cinbis
Frank Keller
A. Muscat
Barbara Plank
EGVM
VLM
27
363
0
15 Jan 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Makarand Tapaswi
Yukun Zhu
Rainer Stiefelhagen
Antonio Torralba
R. Urtasun
Sanja Fidler
52
736
0
09 Dec 2015
A Restricted Visual Turing Test for Deep Scene and Event Understanding
Hang Qi
Tianfu Wu
M. Lee
Song-Chun Zhu
19
12
0
06 Dec 2015
Visual Word2Vec (vis-w2v): Learning Visually Grounded Word Embeddings Using Abstract Scenes
Satwik Kottur
Ramakrishna Vedantam
José M. F. Moura
Devi Parikh
VLM
38
85
0
22 Nov 2015
Ask Me Anything: Free-form Visual Question Answering Based on Knowledge from External Sources
Qi Wu
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
25
370
0
22 Nov 2015