Show and Tell: A Neural Image Caption Generator

17 November 2014

Papers citing "Show and Tell: A Neural Image Caption Generator"

50 / 2,022 papers shown

Title, authors, topic tags (where present), three numeric counts, date
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning Jiasen Lu Caiming Xiong Devi Parikh R. Socher 85 1,442 0 06 Dec 2016
Multi-Label Image Classification with Regional Latent Semantic Dependencies Junjie Zhang Qi Wu Chunhua Shen Jian Zhang Jianfeng Lu 25 165 0 04 Dec 2016
Areas of Attention for Image Captioning M. Pedersoli Thomas Lucas Cordelia Schmid Jakob Verbeek 33 205 0 03 Dec 2016
Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework Yuankai Wu Huachun Tan AI4TS 34 248 0 03 Dec 2016
Commonly Uncommon: Semantic Sparsity in Situation Recognition Mark Yatskar Vicente Ordonez Luke Zettlemoyer Ali Farhadi VLM 9 42 0 03 Dec 2016
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering Yash Goyal Tejas Khot D. Summers-Stay Dhruv Batra Devi Parikh CoGe 125 3,126 0 02 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search Peter Anderson Basura Fernando Mark Johnson Stephen Gould 21 232 0 02 Dec 2016
Self-critical Sequence Training for Image Captioning Steven J. Rennie E. Marcheret Youssef Mroueh Jerret Ross Vaibhava Goel 11 1,877 0 02 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr Siqi Liu Zhenhai Zhu Ning Ye S. Guadarrama Kevin Patrick Murphy 39 440 0 01 Dec 2016
Video Captioning with Multi-Faceted Attention Xiang Long Chuang Gan Gerard de Melo 24 88 0 01 Dec 2016
Towards Robust Deep Neural Networks with BANG Andras Rozsa Manuel Günther Terrance E. Boult AAML OOD 19 76 0 01 Dec 2016
Sequential Person Recognition in Photo Albums with a Recurrent Network Yao Li Guosheng Lin Bohan Zhuang Lingqiao Liu Chunhua Shen Anton Van Den Hengel 24 29 0 30 Nov 2016
Getting Closer to the Essence of Music: The Con Espressione Manifesto Gerhard Widmer 13 14 0 29 Nov 2016
Deep Quantization: Encoding Convolutional Activations with Deep Generative Model Zhaofan Qiu Ting Yao Tao Mei DRL MQ 26 58 0 29 Nov 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning Lorenzo Baraldi C. Grana Rita Cucchiara 28 191 0 28 Nov 2016
Image Based Appraisal of Real Estate Properties Quanzeng You Ran Pang Liangliang Cao Jiebo Luo 18 68 0 28 Nov 2016
Visual Dialog Abhishek Das Satwik Kottur Khushi Gupta Avi Singh Deshraj Yadav José M. F. Moura Devi Parikh Dhruv Batra 69 990 0 26 Nov 2016
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images Junhua Mao Jiajing Xu Yushi Jing Alan Yuille 11 48 0 24 Nov 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling Zhe Gan Chunyuan Li Changyou Chen Yunchen Pu Qinliang Su Lawrence Carin BDL UQCV 53 41 0 23 Nov 2016
Semantic Compositional Networks for Visual Captioning Zhe Gan Chuang Gan Xiaodong He Yunchen Pu Kenneth Tran Jianfeng Gao Lawrence Carin Li Deng CoGe 47 425 0 23 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue H. D. Vries Florian Strub A. Chandar Olivier Pietquin Hugo Larochelle Aaron Courville VLM 32 426 0 23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text Yunchen Pu Martin Renqiang Min Zhe Gan Lawrence Carin 41 14 0 23 Nov 2016
Dense Captioning with Joint Inference and Visual Context L. Yang K. Tang Jianchao Yang Li-Jia Li VLM 30 169 0 21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs J. Krause Justin Johnson Ranjay Krishna Li Fei-Fei VLM 36 373 0 20 Nov 2016
Recurrent Memory Addressing for describing videos A. Jain Abhinav Agarwalla Kumar Krishna Agrawal Pabitra Mitra 38 10 0 20 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning Long Chen Hanwang Zhang Jun Xiao Liqiang Nie Jian Shao Wei Liu Tat-Seng Chua 24 1,650 0 17 Nov 2016
Multimodal Memory Modelling for Video Captioning Junbo Wang Wei Wang Yan Huang Liang Wang Tieniu Tan 32 142 0 17 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM Yan Huang Wei Wang Liang Wang 26 222 0 17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation Feng Liu Tao Xiang Timothy M. Hospedales Wankou Yang Changyin Sun 34 103 0 16 Nov 2016
A Semi-supervised Framework for Image Captioning Wenhu Chen Aurelien Lucchi Thomas Hofmann 34 9 0 16 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives Mohit Iyyer Varun Manjunatha Anupam Guha Yogarshi Vyas Jordan L. Boyd-Graber Hal Daumé L. Davis 30 95 0 16 Nov 2016
Zero-resource Machine Translation by Multimodal Encoder-decoder Network with Multimedia Pivot Hideki Nakayama Noriki Nishida 29 62 0 14 Nov 2016
Leveraging Video Descriptions to Learn Video Question Answering Kuo-Hao Zeng Tseng-Hung Chen Ching-Yao Chuang Yuan-Hong Liao Juan Carlos Niebles Min Sun 32 175 0 12 Nov 2016
Unsupervised Pretraining for Sequence to Sequence Learning Prajit Ramachandran Peter J. Liu Quoc V. Le SSL AIMat 32 281 0 08 Nov 2016
Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents Rui Zhang Honglak Lee Dragomir R. Radev 16 123 0 08 Nov 2016
Memory-augmented Attention Modelling for Videos Rasool Fakoor Abdel-rahman Mohamed Margaret Mitchell S. B. Kang Pushmeet Kohli 48 20 0 07 Nov 2016
Boosting Image Captioning with Attributes Ting Yao Yingwei Pan Yehao Li Zhaofan Qiu Tao Mei VLM 48 620 0 05 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching Hyeonseob Nam Jung-Woo Ha Jeonghee Kim 34 664 0 02 Nov 2016
Inference Compilation and Universal Probabilistic Programming T. Le A. G. Baydin Frank Wood UQCV 52 142 0 31 Oct 2016
Clinical Text Prediction with Numerically Grounded Conditional Language Models Georgios P. Spithourakis S. Petersen Sebastian Riedel 30 7 0 20 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning M. Zanfir Elisabeta Marinoiu C. Sminchisescu 35 50 0 17 Oct 2016
Generating captions without looking beyond objects Hendrik Heuer Christof Monz A. Smeulders 22 16 0 12 Oct 2016
Neural Paraphrase Generation with Stacked Residual LSTM Networks Aaditya (Adi) Prakash Sadid A. Hasan Kathy Lee Vivek Datla Ashequl Qadir Joey Liu Oladimeji Farri 8 264 0 10 Oct 2016
Latent Sequence Decompositions William Chan Yu Zhang Quoc V. Le Navdeep Jaitly 16 62 0 10 Oct 2016
Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models Ashwin K. Vijayakumar Michael Cogswell Ramprasaath R. Selvaraju Q. Sun Stefan Lee David J. Crandall Dhruv Batra 17 541 0 07 Oct 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju Michael Cogswell Abhishek Das Ramakrishna Vedantam Devi Parikh Dhruv Batra FAtt 50 19,576 0 07 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges Kushal Kafle Christopher Kanan OOD 27 235 0 05 Oct 2016
Controlling Output Length in Neural Encoder-Decoders Yuta Kikuchi Graham Neubig Ryohei Sasano Hiroya Takamura Manabu Okumura 19 242 0 30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions Yunchen Pu Zhe Gan Ricardo Henao Xin Yuan Chunyuan Li Andrew Stevens Lawrence Carin BDL CoGe 30 746 0 28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language Atousa Torabi Niket Tandon Leonid Sigal 22 97 0 26 Sep 2016