VQA: Visual Question Answering

3 May 2015

Devi Parikh

Papers citing "VQA: Visual Question Answering"

50 / 2,957 papers shown

Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering V. Kazemi Ali Elqursh OOD 89 185 0 11 Apr 2017
Pay Attention to Those Sets! Learning Quantification from Images Ionut-Teodor Sorodoc Sandro Pezzelle Aurélie Herbelot Mariella Dimiccoli Raffaella Bernardi 37 0 0 10 Apr 2017
An Empirical Evaluation of Visual Question Answering for Novel Objects Santhosh Kumar Ramakrishnan Ambar Pal Gaurav Sharma Anurag Mittal OOD 98 32 0 08 Apr 2017
It Takes Two to Tango: Towards Theory of AI's Mind Arjun Chandrasekaran Deshraj Yadav Prithvijit Chattopadhyay Viraj Prabhu Devi Parikh 115 55 0 03 Apr 2017
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks Tanmay Gupta Kevin J. Shih Saurabh Singh Derek Hoiem 110 26 0 02 Apr 2017
Towards Building Large Scale Multimodal Domain-Aware Conversation Systems Amrita Saha Mitesh Khapra Karthik Sankaranarayanan 86 8 0 01 Apr 2017
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation Albert Gatt E. Krahmer LM&MA ELM 153 828 0 29 Mar 2017
A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment Haonan Yu Haichao Zhang Wenyuan Xu LM&Ro 82 25 0 28 Mar 2017
An Analysis of Visual Question Answering Algorithms Kushal Kafle Christopher Kanan 87 234 0 28 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation Chenxi Liu Zhe Lin Xiaohui Shen Jimei Yang Xin Lu Alan Yuille EgoV 94 241 0 23 Mar 2017
Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning Abhishek Das Satwik Kottur J. M. F. Moura Stefan Lee Dhruv Batra OffRL 157 425 0 20 Mar 2017
VQABQ: Visual Question Answering by Basic Questions Jia-Hong Huang Modar Alfadly Guohao Li 58 25 0 19 Mar 2017
Recurrent Models for Situation Recognition Arun Mallya Svetlana Lazebnik 80 30 0 18 Mar 2017
End-to-end optimization of goal-driven and visually grounded dialogue systems Florian Strub H. D. Vries Jérémie Mary Bilal Piot Aaron Courville Olivier Pietquin OffRL 83 138 0 15 Mar 2017
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection Xiaodan Liang Lisa Lee Eric Xing 88 252 0 08 Mar 2017
Asymmetric Tri-training for Unsupervised Domain Adaptation Kuniaki Saito Yoshitaka Ushiku Tatsuya Harada 149 587 0 27 Feb 2017
Visual Translation Embedding Network for Visual Relation Detection Hanwang Zhang Zawlin Kyaw Shih-Fu Chang Tat-Seng Chua ViT 249 563 0 27 Feb 2017
ViP-CNN: Visual Phrase Guided Convolutional Neural Network Yikang Li Wanli Ouyang Xiaogang Wang Xiaoóu Tang ObjD 69 48 0 23 Feb 2017
Task-driven Visual Saliency and Attention-based Visual Question Answering Yuetan Lin Zhangyang Pang Donghui Wang Yueting Zhuang 61 26 0 22 Feb 2017
Person Search with Natural Language Description Shuang Li Tong Xiao Hongsheng Li Bolei Zhou Dayu Yue Xiaogang Wang 105 396 0 19 Feb 2017
Gated Multimodal Units for Information Fusion John Arevalo Thamar Solorio Manuel Montes-y-Gómez Fabio Gonzalez 106 382 0 07 Feb 2017
Living a discrete life in a continuous world: Reference with distributed representations Gemma Boleda Sebastian Padó N. Pham Marco Baroni 20 0 0 06 Feb 2017
Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation N. Mostafazadeh Chris Brockett W. Dolan Michel Galley Jianfeng Gao Georgios P. Spithourakis Lucy Vanderwende 102 183 0 28 Jan 2017
Context-aware Captions from Context-agnostic Supervision Ramakrishna Vedantam Samy Bengio Kevin Patrick Murphy Devi Parikh Gal Chechik 96 152 0 11 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions Licheng Yu Hao Tan Joey Tianyi Zhou Tamara L. Berg ObjD 98 275 0 30 Dec 2016
Learning Visual N-Grams from Web Data Ang Li Allan Jabri Armand Joulin Laurens van der Maaten VLM 85 138 0 29 Dec 2016
Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task Nan Ding Sebastian Goodman Fei Sha Radu Soricut VLM 80 9 0 22 Dec 2016
CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning Justin Johnson B. Hariharan Laurens van der Maaten Li Fei-Fei C. L. Zitnick Ross B. Girshick CoGe 357 2,394 0 20 Dec 2016
Automatic Generation of Grounded Visual Questions Shijie Zhang Zhuang Li Shaodi You Zhenglu Yang Jiawan Zhang OOD 79 79 0 20 Dec 2016
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions Peng Wang Qi Wu Chunhua Shen Anton Van Den Hengel OOD 90 86 0 16 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence Dong Huk Park Lisa Anne Hendricks Zeynep Akata Bernt Schiele Trevor Darrell Marcus Rohrbach AAML 85 79 0 14 Dec 2016
Learning to Hash-tag Videos with Tag2Vec A. Singh Saurabh Saini R. Shah P. J. Narayanan 32 1 0 13 Dec 2016
VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering Marc Bolaños Álvaro Peris F. Casacuberta Petia Radeva 68 6 0 12 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos Jonghwan Mun Paul Hongsuck Seo Ilchae Jung Bohyung Han 97 110 0 06 Dec 2016
ImageNet pre-trained models with batch normalization Marcel Simon E. Rodner Joachim Denzler VLM SSeg 104 166 0 05 Dec 2016
Deep Multi-Modal Image Correspondence Learning Chen Liu Jiajun Wu Pushmeet Kohli Yasutaka Furukawa 28 5 0 05 Dec 2016
Who is Mistaken? Benjamin Eysenbach Carl Vondrick Antonio Torralba 105 17 0 04 Dec 2016
Commonly Uncommon: Semantic Sparsity in Situation Recognition Mark Yatskar Vicente Ordonez Luke Zettlemoyer Ali Farhadi VLM 73 42 0 03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering Yash Goyal Tejas Khot D. Summers-Stay Dhruv Batra Devi Parikh CoGe 397 3,275 0 02 Dec 2016
Visual Dialog Abhishek Das Satwik Kottur Khushi Gupta Avi Singh Deshraj Yadav José M. F. Moura Devi Parikh Dhruv Batra 163 1,005 0 26 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue H. D. Vries Florian Strub A. Chandar Olivier Pietquin Hugo Larochelle Aaron Courville VLM 110 428 0 23 Nov 2016
A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering Tegan Maharaj Nicolas Ballas Anna Rohrbach Aaron Courville C. Pal VGen 80 108 0 23 Nov 2016
Dense Captioning with Joint Inference and Visual Context L. Yang K. Tang Jianchao Yang Li Li VLM 103 170 0 21 Nov 2016
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues Bryan A. Plummer Arun Mallya Christopher M. Cervantes Julia Hockenmaier Svetlana Lazebnik 139 189 0 21 Nov 2016
Recurrent Memory Addressing for describing videos A. Jain Abhinav Agarwalla Kumar Krishna Agrawal Pabitra Mitra 62 10 0 20 Nov 2016
Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic Somak Aditya Yezhou Yang Chitta Baral Yiannis Aloimonos ReLM 21 4 0 17 Nov 2016
Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance Marco Tulio Ribeiro Sameer Singh Carlos Guestrin FAtt 68 64 0 17 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning Long Chen Hanwang Zhang Jun Xiao Liqiang Nie Jian Shao Wei Liu Tat-Seng Chua 84 1,666 0 17 Nov 2016
Zero-Shot Visual Question Answering Damien Teney Anton Van Den Hengel 90 74 0 17 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives Mohit Iyyer Varun Manjunatha Anupam Guha Yogarshi Vyas Jordan L. Boyd-Graber Hal Daumé L. Davis 85 100 0 16 Nov 2016