VQA: Visual Question Answering

3 May 2015

Devi Parikh

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown

Title
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering Medhini Narasimhan Svetlana Lazebnik Alex Schwing NAI GNN ReLM 69 11 0 01 Nov 2018
A Corpus for Reasoning About Natural Language Grounded in Photographs Alane Suhr Stephanie Zhou Ally Zhang Iris Zhang Huajun Bai Yoav Artzi LRM 120 610 0 01 Nov 2018
How2: A Large-scale Dataset for Multimodal Language Understanding Ramon Sanabria Ozan Caglayan Shruti Palaskar Desmond Elliott Loïc Barrault Lucia Specia Florian Metze VGen MLLM 107 292 0 01 Nov 2018
TallyQA: Answering Complex Counting Questions Manoj Acharya Kushal Kafle Christopher Kanan 71 125 0 29 Oct 2018
Do Explanations make VQA Models more Predictable to a Human? Arjun Chandrasekaran Viraj Prabhu Deshraj Yadav Prithvijit Chattopadhyay Devi Parikh FAtt 150 97 0 29 Oct 2018
Middle-Out Decoding Shikib Mehri Leonid Sigal 68 22 0 28 Oct 2018
Fabrik: An Online Collaborative Neural Network Editor Utsav Garg Viraj Prabhu Deshraj Yadav Ram Ramrakhya Harsh Agrawal Dhruv Batra GNN 65 4 0 27 Oct 2018
Engaging Image Captioning Via Personality Kurt Shuster Samuel Humeau Hexiang Hu Antoine Bordes Jason Weston 87 152 0 25 Oct 2018
Understand, Compose and Respond - Answering Visual Questions by a Composition of Abstract Procedures B. Vatashsky S. Ullman CoGe 72 1 0 25 Oct 2018
Improving Context Modelling in Multimodal Dialogue Generation Shubham Agarwal Ondrej Dusek Ioannis Konstas Verena Rieser 71 19 0 20 Oct 2018
A Knowledge-Grounded Multimodal Search-Based Conversational Agent Shubham Agarwal Ondrej Dusek Ioannis Konstas Verena Rieser 76 22 0 20 Oct 2018
Cross-Modal and Hierarchical Modeling of Video and Text Bowen Zhang Hexiang Hu Fei Sha BDL AI4TS 84 191 0 16 Oct 2018
Learning to Globally Edit Images with Textual Description Hai Wang Jason D. Williams Sing Bing Kang DiffM 75 18 0 13 Oct 2018
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization S. Ramakrishnan Aishwarya Agrawal Stefan Lee AAML 72 239 0 08 Oct 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding Kexin Yi Jiajun Wu Chuang Gan Antonio Torralba Pushmeet Kohli J. Tenenbaum NAI 121 614 0 04 Oct 2018
Transfer Learning via Unsupervised Task Discovery for Visual Question Answering Hyeonwoo Noh Taehoon Kim Jonghwan Mun Bohyung Han 86 17 0 03 Oct 2018
Image as Data: Automated Visual Content Analysis for Political Science Jungseock Joo Zachary C. Steinert-Threlkeld 48 42 0 03 Oct 2018
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition Jianwei Yang Jiasen Lu Stefan Lee Dhruv Batra Devi Parikh 103 42 0 01 Oct 2018
Learning Robust, Transferable Sentence Representations for Text Classification Wasi Uddin Ahmad Xueying Bai Nanyun Peng Kai-Wei Chang AI4TS OOD 61 5 0 28 Sep 2018
A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC Mark Yatskar 93 97 0 27 Sep 2018
Textually Enriched Neural Module Networks for Visual Question Answering Khyathi Chandu Mary Arpita Pyreddy Matthieu Felix N. Joshi 56 6 0 23 Sep 2018
Multimodal Dual Attention Memory for Video Story Question Answering Kyung-Min Kim Seongho Choi Jin-Hwa Kim Byoung-Tak Zhang 77 77 0 21 Sep 2018
Lessons learned in multilingual grounded language learning Ákos Kádár Desmond Elliott Marc-Alexandre Côté Grzegorz Chrupała Afra Alishahi VLM 112 24 0 20 Sep 2018
MTLE: A Multitask Learning Encoder of Visual Feature Representations for Video and Movie Description Oliver A. Nina Washington Garcia Scott Clouse Alper Yilmaz 30 4 0 19 Sep 2018
LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts Shuming Ma Lei Cui Damai Dai Furu Wei Xu Sun VGen 72 63 0 13 Sep 2018
The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA Shailza Jolly Sandro Pezzelle T. Klein Andreas Dengel Moin Nabi 39 2 0 12 Sep 2018
Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances Thao Le Minh N. Shimizu Takashi Miyazaki Koichi Shinoda 32 13 0 12 Sep 2018
Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions M. Wagner H. Basevi Rakshith Shetty Wenbin Li Mateusz Malinowski M. Fritz A. Leonardis 68 29 0 11 Sep 2018
The Visual QA Devil in the Details: The Impact of Early Fusion and Batch Norm on CLEVR Mateusz Malinowski Carl Doersch ReLM 65 12 0 11 Sep 2018
Context-Dependent Diffusion Network for Visual Relationship Detection Zhen Cui Chunyan Xu Wenming Zheng Jian Yang GNN 79 50 0 11 Sep 2018
How clever is the FiLM model, and how clever can it be? A. Kuhnle Huiyuan Xie Ann A. Copestake 68 6 0 09 Sep 2018
Faithful Multimodal Explanation for Visual Question Answering Jialin Wu Raymond J. Mooney 85 91 0 08 Sep 2018
Using Sparse Semantic Embeddings Learned from Multimodal Text and Image Data to Model Human Conceptual Knowledge Steven Derby Paul Miller B. Murphy Barry Devereux 36 15 0 07 Sep 2018
Cascaded Mutual Modulation for Visual Reasoning Yiqun Yao Jiaming Xu Feng Wang Bo Xu LRM 60 14 0 06 Sep 2018
Visual Coreference Resolution in Visual Dialog using Neural Module Networks Satwik Kottur José M. F. Moura Devi Parikh Dhruv Batra Marcus Rohrbach 77 165 0 06 Sep 2018
Interpretable Visual Question Answering by Reasoning on Dependency Trees Qingxing Cao Bailin Li Xiaodan Liang Liang Lin 72 56 0 06 Sep 2018
TVQA: Localized, Compositional Video Question Answering Jie Lei Licheng Yu Mohit Bansal Tamara L. Berg 116 643 0 05 Sep 2018
Retinal Vessel Segmentation under Extreme Low Annotation: A Generative Adversarial Network Approach A. Lahiri V. Jain Arnab Kumar Mondal P. Biswas GAN MedIm 73 12 0 05 Sep 2018
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering Medhini Narasimhan Alex Schwing 79 105 0 04 Sep 2018
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes Semih Yagcioglu Aykut Erdem Erkut Erdem Nazli Ikizler-Cinbis CoGe 64 173 0 04 Sep 2018
Diverse and Coherent Paragraph Generation from Images Moitreya Chatterjee Alex Schwing 75 67 0 03 Sep 2018
Learning to Describe Differences Between Pairs of Similar Images Harsh Jhamtani Taylor Berg-Kirkpatrick 90 155 0 31 Aug 2018
Towards a Better Metric for Evaluating Question Generation Systems Preksha Nema Mitesh M. Khapra 95 108 0 30 Aug 2018
Adapting Visual Question Answering Models for Enhancing Multimodal Community Q&A Platforms Avikalp Srivastava Hsin Wen Liu Sumio Fujita 30 3 0 29 Aug 2018
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering Jianmo Ni Chenguang Zhu Weizhu Chen Julian McAuley RALM 89 38 0 28 Aug 2018
Convolutional Neural Networks for Aerial Vehicle Detection and Recognition Amir Soleimani Nasser M. Nasrabadi E. Griffith J. Ralph Simon Maskell 28 10 0 26 Aug 2018
The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers Dongxiang Zhang Lei Wang Nuo Xu B. Dai Heng Tao Shen ReLM AIMat 98 127 0 22 Aug 2018
CoQA: A Conversational Question Answering Challenge Siva Reddy Danqi Chen Christopher D. Manning RALM HAI 158 1,213 0 21 Aug 2018
Auto-Classification of Retinal Diseases in the Limit of Sparse Data Using a Two-Streams Machine Learning Model Chao-Han Huck Yang Fangyu Liu Jia-Hong Huang Meng Tian Hiromasa Morikawa I-Hung Lin Yi-Chieh Liu Hao-Hsiang Yang Jesper N. Tegnér 81 18 0 16 Aug 2018
Context-Aware Visual Policy Network for Sequence-Level Image Captioning Daqing Liu Zhengjun Zha Hanwang Zhang Yongdong Zhang Feng Wu CLIP 103 104 0 16 Aug 2018