A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input

1 October 2014

Mateusz Malinowski, Mario Fritz

Papers citing "A Multi-World Approach to Question Answering about Real-World Scenes based on Uncertain Input"

50 / 330 papers shown

Adversarial Multimodal Network for Movie Question Answering Zhaoquan Yuan Siyuan Sun Lixin Duan Xiao Wu Changsheng Xu 24 3 0 24 Jun 2019
Improving Visual Question Answering by Referring to Generated Paragraph Captions Hyounghun Kim Joey Tianyi Zhou CoGe 19 20 0 14 Jun 2019
Figure Captioning with Reasoning and Sequence-Level Training Charles C. Chen Ruiyi Zhang Eunyee Koh Sungchul Kim Scott D. Cohen Tong Yu Ryan Rossi Razvan Bunescu AIMat 31 38 0 07 Jun 2019
ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering Zhou Yu D. Xu Jun-chen Yu Ting Yu Zhou Zhao Yueting Zhuang Dacheng Tao 24 440 0 06 Jun 2019
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge Kenneth Marino Mohammad Rastegari Ali Farhadi Roozbeh Mottaghi 19 1,020 0 31 May 2019
Scene Text Visual Question Answering Ali Furkan Biten Rubèn Pérez Tito Andrés Mafla Lluís Gómez Marçal Rusiñol Ernest Valveny C. V. Jawahar Dimosthenis Karatzas 41 343 0 31 May 2019
Vision-to-Language Tasks Based on Attributes and Attention Mechanism Xuelong Li Aihong Yuan Xiaoqiang Lu 21 37 0 29 May 2019
Leveraging Medical Visual Question Answering with Supporting Facts Tomasz Kornuta Deepta Rajan Chaitanya P. Shivade Alexis Asseman A. Ozcan 23 16 0 28 May 2019
Structure Learning for Neural Module Networks Vardaan Pahuja Jie Fu Sarath Chandar C. Pal 21 7 0 27 May 2019
Deep Reason: A Strong Baseline for Real-World Visual Reasoning Chenfei Wu Yanzhao Zhou Gen Li Nan Duan Duyu Tang Xiaojie Wang LRM NAI ReLM 16 2 0 24 May 2019
Quantifying and Alleviating the Language Prior Problem in Visual Question Answering Yangyang Guo Zhiyong Cheng Liqiang Nie Yebin Liu Yinglong Wang Mohan Kankanhalli 22 36 0 13 May 2019
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision Jiayuan Mao Chuang Gan Pushmeet Kohli J. Tenenbaum Jiajun Wu NAI 19 686 0 26 Apr 2019
Towards VQA Models That Can Read Amanpreet Singh Vivek Natarajan Meet Shah Yu Jiang Xinlei Chen Dhruv Batra Devi Parikh Marcus Rohrbach EgoV 15 1,136 0 18 Apr 2019
Factor Graph Attention Idan Schwartz Seunghak Yu Tamir Hazan Alex Schwing 30 110 0 11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog Idan Schwartz Alex Schwing Tamir Hazan 27 69 0 11 Apr 2019
Reasoning Visual Dialogs with Structural and Partial Observations Zilong Zheng Wenguan Wang Siyuan Qi Song-Chun Zhu 39 117 0 11 Apr 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering Chenyou Fan Xiaofan Zhang Shu Zhang Wensheng Wang Chi Zhang Heng-Chiao Huang 21 276 0 08 Apr 2019
What Object Should I Use? - Task Driven Object Detection Johann Sawatzky Yaser Souri C. Grund Juergen Gall ObjD 27 26 0 05 Apr 2019
VQD: Visual Query Detection in Natural Scenes Manoj Acharya Karan Jariwala Christopher Kanan ObjD 24 18 0 04 Apr 2019
Visual Query Answering by Entity-Attribute Graph Matching and Reasoning Peixi Xiong Huayi Zhan Xin Eric Wang Baivab Sinha Ying Nian Wu 18 16 0 16 Mar 2019
Learning To Follow Directions in Street View Karl Moritz Hermann Mateusz Malinowski Piotr Wojciech Mirowski Andras Banki-Horvath Keith Anderson R. Hadsell SSL 29 66 0 01 Mar 2019
Answer Them All! Toward Universal Visual Question Answering Models Robik Shrestha Kushal Kafle Christopher Kanan 25 82 0 01 Mar 2019
GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering Drew A. Hudson Christopher D. Manning CoGe NAI 27 137 0 25 Feb 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering Rémi Cadène H. Ben-younes Matthieu Cord Nicolas Thome LRM 19 272 0 25 Feb 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding Ning Xie Farley Lai Derek Doran Asim Kadav CoGe 56 322 0 20 Jan 2019
Memory Augmented Deep Generative models for Forecasting the Next Shot Location in Tennis Tharindu Fernando Simon Denman Sridha Sridharan Clinton Fookes GAN 17 34 0 16 Jan 2019
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning Aishwarya Agrawal Mateusz Malinowski Felix Hill S. M. Ali Eslami Oriol Vinyals Tejas D. Kulkarni 21 4 0 03 Dec 2018
How to Make a BLT Sandwich? Learning to Reason towards Understanding Web Instructional Videos Shaojie Wang Wentian Zhao Ziyi Kou Chenliang Xu 19 5 0 02 Dec 2018
VQA with no questions-answers training B. Vatashsky S. Ullman 41 12 0 20 Nov 2018
Explicit Bias Discovery in Visual Question Answering Models Varun Manjunatha Nirat Saini L. Davis CML FAtt 19 92 0 19 Nov 2018
On transfer learning using a MAC model variant Vincent Marois T. S. Jayram V. Albouy Tomasz Kornuta Younes Bouhadjar A. Ozcan DRL 26 9 0 15 Nov 2018
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering Medhini Narasimhan Svetlana Lazebnik Alex Schwing NAI GNN ReLM 26 11 0 01 Nov 2018
TallyQA: Answering Complex Counting Questions Manoj Acharya Kushal Kafle Christopher Kanan 19 112 0 29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human? Arjun Chandrasekaran Viraj Prabhu Deshraj Yadav Prithvijit Chattopadhyay Devi Parikh FAtt 92 97 0 29 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Kexin Yi Jiajun Wu Chuang Gan Antonio Torralba Pushmeet Kohli J. Tenenbaum NAI 46 599 0 04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering Hyeonwoo Noh Taehoon Kim Jonghwan Mun Bohyung Han 36 17 0 03 Oct 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA Shailza Jolly Sandro Pezzelle T. Klein Andreas Dengel Moin Nabi 27 2 0 12 Sep 2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions M. Wagner H. Basevi Rakshith Shetty Wenbin Li Mateusz Malinowski M. Fritz A. Leonardis 27 29 0 11 Sep 2018
The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR Mateusz Malinowski Carl Doersch ReLM 19 12 0 11 Sep 2018
Exploration on Grounded Word Embedding: Matching Words and Images with Image-Enhanced Skip-Gram Model Ruixuan Luo 14 0 0 08 Sep 2018
TVQA: Localized, Compositional Video Question Answering Jie Lei Licheng Yu Mohit Bansal Tamara L. Berg 36 617 0 05 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering Medhini Narasimhan Alex Schwing 24 105 0 04 Sep 2018
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms Avikalp Srivastava Hsin Wen Liu Sumio Fujita 28 3 0 29 Aug 2018
Multimodal Differential Network for Visual Question Generation Badri N. Patro Sandeep Kumar V. Kurmi Vinay P. Namboodiri 21 41 0 12 Aug 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval Youngjae Yu Jongseok Kim Gunhee Kim 40 340 0 07 Aug 2018
Learning Visual Question Answering by Bootstrapping Hard Attention Mateusz Malinowski Carl Doersch Adam Santoro Peter W. Battaglia OOD 27 96 0 01 Aug 2018
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining Yundong Zhang Juan Carlos Niebles Á. Soto 35 67 0 01 Aug 2018
Pedestrian Trajectory Prediction with Structured Memory Hierarchies Tharindu Fernando Simon Denman Sridha Sridharan Clinton Fookes 19 18 0 22 Jul 2018
Modularity Matters: Learning Invariant Relational Reasoning Tasks Jason Jo Vikas Verma Yoshua Bengio OOD 11 8 0 18 Jun 2018
Grounded Textual Entailment H. Vu Claudio Greco A. Erofeeva Somayeh Jafaritazehjan Guido M. Linders Marc Tanti A. Testoni Raffaella Bernardi Albert Gatt 24 29 0 14 Jun 2018