Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown

Title | Authors | Tags | | | | Date
Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages | Hans Krupakar, R. S. Milton | | 91 | 16 | 0 | 07 Dec 2016
Spatially Adaptive Computation Time for Residual Networks | Michael Figurnov, Maxwell D. Collins, Yukun Zhu, Li Zhang, Jonathan Huang, Dmitry Vetrov, Ruslan Salakhutdinov | | 75 | 351 | 0 | 07 Dec 2016
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning | Jiasen Lu, Caiming Xiong, Devi Parikh, R. Socher | | 146 | 1,458 | 0 | 06 Dec 2016
Condensed Memory Networks for Clinical Diagnostic Inferencing | Aaditya (Adi) Prakash, Siyuan Zhao, Sadid A. Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri | | 69 | 103 | 0 | 06 Dec 2016
Learning to Detect Multiple Photographic Defects | Ning Yu, Xiaohui Shen, Zhe Lin, R. Měch, Connelly Barnes | | 76 | 14 | 0 | 06 Dec 2016
ImageNet pre-trained models with batch normalization | Marcel Simon, E. Rodner, Joachim Denzler | VLM, SSeg | 104 | 166 | 0 | 05 Dec 2016
Areas of Attention for Image Captioning | M. Pedersoli, Thomas Lucas, Cordelia Schmid, Jakob Verbeek | | 117 | 206 | 0 | 03 Dec 2016
Parameter Compression of Recurrent Neural Networks and Degradation of Short-term Memory | Jonathan A. Cox | | 22 | 5 | 0 | 02 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search | Peter Anderson, Basura Fernando, Mark Johnson, Stephen Gould | | 97 | 238 | 0 | 02 Dec 2016
Self-critical Sequence Training for Image Captioning | Steven J. Rennie, E. Marcheret, Youssef Mroueh, Jerret Ross, Vaibhava Goel | | 149 | 1,898 | 0 | 02 Dec 2016
Temporal Attention-Gated Model for Robust Sequence Classification | Wenjie Pei, T. Baltrušaitis, David Tax, Louis-Philippe Morency | | 82 | 89 | 0 | 01 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr | Siqi Liu, Zhenhai Zhu, Ning Ye, S. Guadarrama, Kevin Patrick Murphy | | 181 | 446 | 0 | 01 Dec 2016
Video Captioning with Multi-Faceted Attention | Xiang Long, Chuang Gan, Gerard de Melo | | 87 | 88 | 0 | 01 Dec 2016
Sync-DRAW: Automatic Video Generation using Deep Recurrent Attentive Architectures | Gaurav Mittal, Tanya Marwah, V. Balasubramanian | VGen, DiffM | 101 | 67 | 0 | 30 Nov 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks | Ronghang Hu, Marcus Rohrbach, Jacob Andreas, Trevor Darrell, Kate Saenko | | 84 | 407 | 0 | 30 Nov 2016
Attend in groups: a weakly-supervised deep learning framework for learning from web data | Bohan Zhuang, Lingqiao Liu, Yao Li, Chunhua Shen, Ian Reid | NoLa | 69 | 89 | 0 | 30 Nov 2016
Context-aware Natural Language Generation with Recurrent Neural Networks | Jian Tang, Yifan Yang, Samuel Carton, Ming Zhang, Qiaozhu Mei | | 82 | 67 | 0 | 29 Nov 2016
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model | Marcella Cornia, Lorenzo Baraldi, G. Serra, Rita Cucchiara | | 130 | 552 | 0 | 29 Nov 2016
Deep Quantization: Encoding Convolutional Activations with Deep Generative Model | Zhaofan Qiu, Ting Yao, Tao Mei | DRL, MQ | 83 | 60 | 0 | 29 Nov 2016
Emergence of foveal image sampling from learning to attend in visual scenes | Brian Cheung, E. Weiss, Bruno A. Olshausen | | 89 | 39 | 0 | 28 Nov 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning | Lorenzo Baraldi, C. Grana, Rita Cucchiara | | 82 | 192 | 0 | 28 Nov 2016
Attention-based Memory Selection Recurrent Network for Language Modeling | Da-Rong Liu, Shun-Po Chuang, Hung-yi Lee | RALM, KELM | 45 | 5 | 0 | 26 Nov 2016
Neural Machine Translation with Latent Semantic of Image and Text | Joji Toyama, Masanori Misono, Masahiro Suzuki, Kotaro Nakayama, Y. Matsuo | | 134 | 14 | 0 | 25 Nov 2016
An Overview on Data Representation Learning: From Traditional Feature Learning to Recent Deep Learning | G. Zhong, Lina Wang, Junyu Dong | AI4TS | 83 | 183 | 0 | 25 Nov 2016
Semantic Compositional Networks for Visual Captioning | Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng | CoGe | 114 | 427 | 0 | 23 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue | H. D. Vries, Florian Strub, A. Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville | VLM | 112 | 428 | 0 | 23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text | Yunchen Pu, Martin Renqiang Min, Zhe Gan, Lawrence Carin | | 72 | 14 | 0 | 23 Nov 2016
Recurrent Attention Models for Depth-Based Person Identification | Albert Haque, Alexandre Alahi, Li Fei-Fei | 3DH | 87 | 142 | 0 | 22 Nov 2016
GRAM: Graph-based Attention Model for Healthcare Representation Learning | Edward Choi, M. T. Bahadori, Le Song, Walter F. Stewart, Jimeng Sun | GNN | 97 | 678 | 0 | 21 Nov 2016
Coherent Dialogue with Attention-based Language Models | Hongyuan Mei, Joey Tianyi Zhou, Matthew R. Walter | AuLLM | 80 | 83 | 0 | 21 Nov 2016
Dense Captioning with Joint Inference and Visual Context | L. Yang, K. Tang, Jianchao Yang, Li Li | VLM | 103 | 170 | 0 | 21 Nov 2016
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues | Bryan A. Plummer, Arun Mallya, Christopher M. Cervantes, Julia Hockenmaier, Svetlana Lazebnik | | 142 | 189 | 0 | 21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs | J. Krause, Justin Johnson, Ranjay Krishna, Li Fei-Fei | VLM | 106 | 379 | 0 | 20 Nov 2016
Recurrent Memory Addressing for describing videos | A. Jain, Abhinav Agarwalla, Kumar Krishna Agrawal, Pabitra Mitra | | 62 | 10 | 0 | 20 Nov 2016
An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data | Sijie Song, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jiaying Liu | | 202 | 991 | 0 | 18 Nov 2016
Cross Domain Knowledge Transfer for Person Re-identification | Qiqi Xiao, Kelei Cao, Haonan Chen, Fangyue Peng, Fangqiu Yi | | 91 | 18 | 0 | 18 Nov 2016
AutoScaler: Scale-Attention Networks for Visual Correspondence | Shenlong Wang, Linjie Luo, Ning Zhang, Jia Li | | 66 | 19 | 0 | 17 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning | Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua | | 115 | 1,667 | 0 | 17 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM | Yan Huang, Wei Wang, Liang Wang | | 114 | 223 | 0 | 17 Nov 2016
DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows | Jason Kuen, Xiangfei Kong, G. Wang, Yap-Peng Tan | | 70 | 14 | 0 | 17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation | Feng Liu, Tao Xiang, Timothy M. Hospedales, Wankou Yang, Changyin Sun | | 107 | 104 | 0 | 16 Nov 2016
A Semi-supervised Framework for Image Captioning | Wenhu Chen, Aurelien Lucchi, Thomas Hofmann | | 92 | 9 | 0 | 16 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives | Mohit Iyyer, Varun Manjunatha, Anupam Guha, Yogarshi Vyas, Jordan L. Boyd-Graber, Hal Daumé, L. Davis | | 87 | 101 | 0 | 16 Nov 2016
Diversity encouraged learning of unsupervised LSTM ensemble for neural activity video prediction | Yilin Song, J. Viventi, Yao Wang | AI4TS | 54 | 2 | 0 | 15 Nov 2016
Hierarchical Object Detection with Deep Reinforcement Learning | Míriam Bellver, Xavier Giró-i-Nieto, F. Marqués, Jordi Torres | | 80 | 105 | 0 | 11 Nov 2016
Getting Started with Neural Models for Semantic Matching in Web Search | Kezban Dilek Onal, I. S. Altingövde, Pinar Senkul, Maarten de Rijke | VLM, 3DV | 63 | 9 | 0 | 08 Nov 2016
Memory-augmented Attention Modelling for Videos | Rasool Fakoor, Abdel-rahman Mohamed, Margaret Mitchell, S. B. Kang, Pushmeet Kohli | | 115 | 20 | 0 | 07 Nov 2016
Latent Attention For If-Then Program Synthesis | Xinyun Chen, Chang-rui Liu, E. C. Shin, Basel Alomair, Mingcheng Chen | | 83 | 70 | 0 | 07 Nov 2016
Hierarchical Question Answering for Long Documents | Eunsol Choi, D. Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant | RALM | 99 | 168 | 0 | 06 Nov 2016
Boosting Image Captioning with Attributes | Ting Yao, Yingwei Pan, Yehao Li, Zhaofan Qiu, Tao Mei | VLM | 132 | 624 | 0 | 05 Nov 2016