Explain Images with Multimodal Recurrent Neural Networks

4 October 2014

Yi Yang

Papers citing "Explain Images with Multimodal Recurrent Neural Networks"

50 / 116 papers shown

Commonly Uncommon: Semantic Sparsity in Situation Recognition Mark Yatskar Vicente Ordonez Luke Zettlemoyer Ali Farhadi VLM 17 42 0 03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering Yash Goyal Tejas Khot D. Summers-Stay Dhruv Batra Devi Parikh CoGe 155 3,136 0 02 Dec 2016
Video Captioning with Transferred Semantic Attributes Yingwei Pan Ting Yao Houqiang Li Tao Mei 27 329 0 23 Nov 2016
Dense Captioning with Joint Inference and Visual Context L. Yang K. Tang Jianchao Yang Li-Jia Li VLM 30 169 0 21 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM Yan Huang Wei Wang Liang Wang 26 222 0 17 Nov 2016
A Semi-supervised Framework for Image Captioning Wenhu Chen Aurelien Lucchi Thomas Hofmann 37 9 0 16 Nov 2016
Boosting Image Captioning with Attributes Ting Yao Yingwei Pan Yehao Li Zhaofan Qiu Tao Mei VLM 48 620 0 05 Nov 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge Oriol Vinyals Alexander Toshev Samy Bengio D. Erhan 30 851 0 21 Sep 2016
GeThR-Net: A Generalized Temporally Hybrid Recurrent Neural Network for Multimodal Information Fusion Ankit Gandhi Arjun Sharma Arijit Biswas Om Deshmukh AI4TS 21 12 0 17 Sep 2016
Linking Image and Text with 2-Way Nets Aviv Eisenschtat Lior Wolf 27 176 0 29 Aug 2016
Learning to generalize to new compositions in image understanding Yuval Atzmon Jonathan Berant Vahid Kazemi Amir Globerson Gal Chechik 26 67 0 27 Aug 2016
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams Chenyou Fan David J. Crandall DiffM 14 5 0 12 Aug 2016
Multilingual Visual Sentiment Concept Matching Nikolaos Pappas Miriam Redi Mercan Topkara Brendan Jou Hongyi Liu Tao Chen Shih-Fu Chang CVBM 26 14 0 07 Jun 2016
Automated Image Captioning for Rapid Prototyping and Resource Constrained Environments Karan Sharma Arun C. S. Kumar S. Bhandarkar 20 0 0 04 Jun 2016
Annotation Order Matters: Recurrent Image Annotator for Arbitrary Length Image Tagging Jiren Jin Hideki Nakayama 3DV VLM 30 69 0 18 Apr 2016
Generating Visual Explanations Lisa Anne Hendricks Zeynep Akata Marcus Rohrbach Jeff Donahue Bernt Schiele Trevor Darrell VLM FAtt 47 618 0 28 Mar 2016
Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation Hoo-Chang Shin Kirk Roberts Le Lu Dina Demner-Fushman Jianhua Yao Ronald M. Summers 24 347 0 28 Mar 2016
Content-based Video Indexing and Retrieval Using Corr-LDA R. Iyer Sanjeel Parekh Vikas Mohandoss Anush Ramsurat Bhiksha Raj Rita Singh 16 22 0 27 Feb 2016
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations Ranjay Krishna Yuke Zhu Oliver Groth Justin Johnson Kenji Hata ... Yannis Kalantidis Li-Jia Li David A. Shamma Michael S. Bernstein Fei-Fei Li 108 5,663 0 23 Feb 2016
Generate Image Descriptions based on Deep RNN and Memory Cells for Images Features Shijian Tang Song Han VLM 20 1 0 05 Feb 2016
Event Specific Multimodal Pattern Mining with Image-Caption Pairs Hongzhi Li Joseph G. Ellis Shih-Fu Chang 6 2 0 31 Dec 2015
RNN Fisher Vectors for Action Recognition and Image Annotation Guy Lev Gil Sadeh Benjamin Klein Lior Wolf 19 163 0 12 Dec 2015
Neural Self Talk: Image Understanding via Continuous Questioning and Answering Yezhou Yang Yi Li Cornelia Fermuller Yiannis Aloimonos 19 24 0 10 Dec 2015
Natural Language Understanding with Distributed Representation Kyunghyun Cho GNN BDL 21 55 0 24 Nov 2015
DenseCap: Fully Convolutional Localization Networks for Dense Captioning Justin Johnson A. Karpathy Li Fei-Fei VLM 74 1,160 0 24 Nov 2015
Where To Look: Focus Regions for Visual Question Answering Kevin J. Shih Saurabh Singh Derek Hoiem 34 456 0 23 Nov 2015
Visual Word2Vec (vis-w2v): Learning Visually Grounded Word Embeddings Using Abstract Scenes Satwik Kottur Ramakrishna Vedantam José M. F. Moura Devi Parikh VLM 38 85 0 22 Nov 2015
Asymmetrically Weighted CCA And Hierarchical Kernel Sentence Embedding For Image & Text Retrieval Youssef Mroueh E. Marcheret Vaibhava Goel 21 3 0 19 Nov 2015
Recurrent Neural Networks Hardware Implementation on FPGA Andre Xian Ming Chang B. Martini Eugenio Culurciello 27 126 0 17 Nov 2015
Yin and Yang: Balancing and Answering Binary Visual Questions Peng Zhang Yash Goyal D. Summers-Stay Dhruv Batra Devi Parikh CoGe 37 349 0 16 Nov 2015
From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge Somak Aditya Yezhou Yang Chitta Baral Cornelia Fermuller Yiannis Aloimonos 3DV 19 69 0 10 Nov 2015
Automatic Concept Discovery from Parallel Text and Visual Corpora Chen Sun Chuang Gan Ram Nevatia CoGe 12 107 0 24 Sep 2015
Image Representations and New Domains in Neural Image Captioning Jack Hessel Nicolas Savva Michael J. Wilber VLM 30 16 0 09 Aug 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks Kyunghyun Cho Aaron Courville Yoshua Bengio 32 411 0 04 Jul 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books Yukun Zhu Ryan Kiros R. Zemel Ruslan Salakhutdinov R. Urtasun Antonio Torralba Sanja Fidler 60 2,517 0 22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization Junqi Jin Kun Fu Runpeng Cui Fei Sha Changshui Zhang 34 117 0 20 Jun 2015
Learning language through pictures Grzegorz Chrupała Ákos Kádár A. Alishahi VLM SSL 35 65 0 11 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network Lin Ma Zhengdong Lu Hang Li 27 261 0 01 Jun 2015
A Multi-scale Multiple Instance Video Description Network Huijuan Xu Subhashini Venugopalan Vasili Ramanishka Marcus Rohrbach Kate Saenko 40 64 0 21 May 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering Haoyuan Gao Junhua Mao Jie Zhou Zhiheng Huang Lei Wang Wenyuan Xu 32 496 0 21 May 2015
Visual Semantic Role Labeling Saurabh Gupta Jitendra Malik 29 404 0 17 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning Jacob Devlin Saurabh Gupta Ross B. Girshick Margaret Mitchell C. L. Zitnick 27 195 0 17 May 2015
Exploring Models and Data for Image Question Answering Mengye Ren Ryan Kiros R. Zemel 44 711 0 08 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language Yingwei Pan Tao Mei Ting Yao Houqiang Li Y. Rui 41 535 0 07 May 2015
Language Models for Image Captioning: The Quirks and What Works Jacob Devlin Hao Cheng Hao Fang Saurabh Gupta Li Deng Xiaodong He Geoffrey Zweig Margaret Mitchell 32 281 0 07 May 2015
VQA: Visual Question Answering Aishwarya Agrawal Jiasen Lu Stanislaw Antol Margaret Mitchell C. L. Zitnick Dhruv Batra Devi Parikh CoGe 96 5,383 0 03 May 2015
Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images Junhua Mao Xu Wei Yi Yang Jiang Wang Zhiheng Huang Alan Yuille 25 154 0 25 Apr 2015
Multimodal Convolutional Neural Networks for Matching Image and Sentence Lin Ma Zhengdong Lu Lifeng Shang Hang Li 38 337 0 23 Apr 2015
Microsoft COCO Captions: Data Collection and Evaluation Server Xinlei Chen Hao Fang Nayeon Lee Ramakrishna Vedantam Saurabh Gupta Piotr Dollar C. L. Zitnick 97 2,434 0 01 Apr 2015
Generating Multi-Sentence Lingual Descriptions of Indoor Scenes Dahua Lin Chen Kong Sanja Fidler R. Urtasun 3DV 18 27 0 28 Feb 2015