Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding

6 June 2016

Papers citing "Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding"

25 / 225 papers shown

Title
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering Y. Jang Yale Song Youngjae Yu Youngjin Kim Gunhee Kim 32 546 0 14 Apr 2017
Learning Two-Branch Neural Networks for Image-Text Matching Tasks Liwei Wang Yin Li Jing-ling Huang Svetlana Lazebnik VLM 27 494 0 11 Apr 2017
Stochastic Neural Networks for Hierarchical Reinforcement Learning Carlos Florensa Yan Duan Pieter Abbeel BDL 32 360 0 10 Apr 2017
AMC: Attention guided Multi-modal Correlation Learning for Image Search Kan Chen Trung Bui Chen Fang Zhaowen Wang Ram Nevatia 35 38 0 03 Apr 2017
An Analysis of Visual Question Answering Algorithms Kushal Kafle Christopher Kanan 30 230 0 28 Mar 2017
Multimodal Compact Bilinear Pooling for Multimodal Neural Machine Translation Jean-Benoit Delbrouck Stéphane Dupont 35 30 0 23 Mar 2017
Recurrent Multimodal Interaction for Referring Image Segmentation Chenxi Liu Zhe-nan Lin Xiaohui Shen Jimei Yang Xin Lu Alan Yuille EgoV 36 234 0 23 Mar 2017
Person Search with Natural Language Description Shuang Li Tong Xiao Hongsheng Li Bolei Zhou Dayu Yue Xiaogang Wang 19 385 0 19 Feb 2017
Comprehension-guided referring expressions Ruotian Luo Gregory Shakhnarovich ObjD 29 171 0 12 Jan 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions Licheng Yu Hao Tan Joey Tianyi Zhou Tamara L. Berg ObjD 37 272 0 30 Dec 2016
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions Peng Wang Qi Wu Chunhua Shen Anton Van Den Hengel OOD 23 86 0 16 Dec 2016
Attentive Explanations: Justifying Decisions and Pointing to the Evidence Dong Huk Park Lisa Anne Hendricks Zeynep Akata Bernt Schiele Trevor Darrell Marcus Rohrbach AAML 21 79 0 14 Dec 2016
MarioQA: Answering Questions by Watching Gameplay Videos Jonghwan Mun Paul Hongsuck Seo Ilchae Jung Bohyung Han 42 108 0 06 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering Yash Goyal Tejas Khot D. Summers-Stay Dhruv Batra Devi Parikh CoGe 104 3,120 0 02 Dec 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks Ronghang Hu Marcus Rohrbach Jacob Andreas Trevor Darrell Kate Saenko 42 401 0 30 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching Hyeonseob Nam Jung-Woo Ha Jeonghee Kim 34 664 0 02 Nov 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering Youngjae Yu Hyungjin Ko Jongwook Choi Gunhee Kim 14 229 0 10 Oct 2016
Tutorial on Answering Questions about Images with Deep Learning Mateusz Malinowski Mario Fritz VLM 34 3 0 04 Oct 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA) Andrew Shin Yoshitaka Ushiku Tatsuya Harada 44 14 0 21 Sep 2016
Machine Comprehension Using Match-LSTM and Answer Pointer Shuohang Wang Jing Jiang 13 594 0 29 Aug 2016
Solving Visual Madlibs with Multiple Cues Tatiana Tommasi Arun Mallya Bryan A. Plummer Svetlana Lazebnik Alexander C. Berg Tamara L. Berg 37 18 0 11 Aug 2016
Semantic Parsing to Probabilistic Programs for Situated Question Answering Jayant Krishnamurthy Oyvind Tafjord Aniruddha Kembhavi 29 24 0 22 Jun 2016
Adversarial Feature Learning Jiasen Lu Philipp Krahenbuhl Trevor Darrell GAN 41 1,598 0 31 May 2016
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering Arun Mallya Svetlana Lazebnik 36 119 0 16 Apr 2016
Explicit Knowledge-based Reasoning for Visual Question Answering Peng Wang Qi Wu Chunhua Shen Anton Van Den Hengel A. Dick 39 257 0 09 Nov 2015