Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015

Jimmy Ba

Aaron Courville

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown

Title
Yin and Yang: Balancing and Answering Binary Visual Questions Peng Zhang Yash Goyal D. Summers-Stay Dhruv Batra Devi Parikh CoGe 116 352 0 16 Nov 2015
Sherlock: Scalable Fact Learning in Images Mohamed Elhoseiny Scott D. Cohen W. Chang Brian L. Price Ahmed Elgammal 59 26 0 16 Nov 2015
Neural Programmer: Inducing Latent Programs with Gradient Descent Arvind Neelakantan Quoc V. Le Ilya Sutskever ODL 129 262 0 16 Nov 2015
Uncovering Temporal Context for Video Question and Answering Linchao Zhu Zhongwen Xu Yi Yang Alexander G. Hauptmann BDL 90 45 0 15 Nov 2015
Oracle performance for visual captioning L. Yao Nicolas Ballas Kyunghyun Cho John R. Smith Yoshua Bengio VLM 111 8 0 14 Nov 2015
Reversible Recursive Instance-level Object Segmentation Xiaodan Liang Yunchao Wei Xiaohui Shen Zequn Jie Jiashi Feng Liang Lin Shuicheng Yan SSeg ISeg 67 59 0 14 Nov 2015
Semantic Object Parsing with Local-Global Long Short-Term Memory Xiaodan Liang Xiaohui Shen Donglai Xiang Jiashi Feng Liang Lin Shuicheng Yan 85 185 0 14 Nov 2015
Natural Language Object Retrieval Ronghang Hu Huazhe Xu Marcus Rohrbach Jiashi Feng Kate Saenko Trevor Darrell ObjD 147 555 0 13 Nov 2015
Action Recognition using Visual Attention Shikhar Sharma Ryan Kiros Ruslan Salakhutdinov 102 667 0 12 Nov 2015
Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising Raviteja Vemulapalli Oncel Tuzel Ming-Yuan Liu 80 70 0 12 Nov 2015
Hand-Object Interaction and Precise Localization in Transitive Action Recognition Amir Rosenfeld S. Ullman 68 8 0 12 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction Anna Rohrbach Marcus Rohrbach Ronghang Hu Trevor Darrell Bernt Schiele 90 497 0 12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews Zachary Chase Lipton Sharad Vikram Julian McAuley BDL 115 33 0 11 Nov 2015
Visual7W: Grounded Question Answering in Images Yuke Zhu Oliver Groth Michael S. Bernstein Li Fei-Fei 166 891 0 11 Nov 2015
Attention to Scale: Scale-aware Semantic Image Segmentation Liang-Chieh Chen Yi Yang Jiang Wang Wei Xu Alan Yuille SSeg 162 1,322 0 10 Nov 2015
Detecting events and key actors in multi-person videos Vignesh Ramanathan Jonathan Huang Sami Abu-El-Haija Alexander N. Gorban Kevin Patrick Murphy Li Fei-Fei 98 209 0 09 Nov 2015
Neural Module Networks Jacob Andreas Marcus Rohrbach Trevor Darrell Dan Klein CoGe 177 1,079 0 09 Nov 2015
Generating Images from Captions with Attention Elman Mansimov Emilio Parisotto Jimmy Lei Ba Ruslan Salakhutdinov VLM 102 457 0 09 Nov 2015
Explicit Knowledge-based Reasoning for Visual Question Answering Peng Wang Qi Wu Chunhua Shen Anton Van Den Hengel A. Dick 91 261 0 09 Nov 2015
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations Felix Hill Antoine Bordes S. Chopra Jason Weston RALM 195 638 0 07 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions Junhua Mao Jonathan Huang Alexander Toshev Oana-Maria Camburu Alan Yuille Kevin Patrick Murphy ObjD 144 1,362 0 07 Nov 2015
Stacked Attention Networks for Image Question Answering Zichao Yang Xiaodong He Jianfeng Gao Li Deng Alex Smola BDL 184 1,889 0 07 Nov 2015
Deep Kernel Learning A. Wilson Zhiting Hu Ruslan Salakhutdinov Eric Xing BDL 303 893 0 06 Nov 2015
RATM: Recurrent Attentive Tracking Model Samira Ebrahimi Kahou Vincent Michalski Roland Memisevic 87 84 0 29 Oct 2015
On End-to-End Program Generation from User Intention by Deep Neural Networks Lili Mou Rui Men Ge Li Lu Zhang Zhi Jin 71 46 0 25 Oct 2015
Generic decoding of seen and imagined objects using hierarchical visual features T. Horikawa Y. Kamitani 50 454 0 22 Oct 2015
Multilingual Image Description with Neural Sequence Models Desmond Elliott Stella Frank Eva Hasler VLM 145 76 0 15 Oct 2015
A Diversity-Promoting Objective Function for Neural Conversation Models Jiwei Li Michel Galley Chris Brockett Jianfeng Gao W. Dolan 166 2,407 0 11 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments A. Mathews Lexing Xie Xuming He 110 222 0 06 Oct 2015
Learning Wake-Sleep Recurrent Attention Models Jimmy Ba Roger C. Grosse Ruslan Salakhutdinov B. Frey BDL 97 65 0 22 Sep 2015
Reasoning about Entailment with Neural Attention Tim Rocktaschel Edward Grefenstette Karl Moritz Hermann Tomás Kociský Phil Blunsom NAI 95 764 0 22 Sep 2015
Recurrent Spatial Transformer Networks Søren Kaae Sønderby C. Sønderby Lars Maaløe Ole Winther ViT 67 48 0 17 Sep 2015
Guiding Long-Short Term Memory for Image Caption Generation Xu Jia E. Gavves Basura Fernando Tinne Tuytelaars VLM 70 101 0 16 Sep 2015
What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment Hongyuan Mei Joey Tianyi Zhou Matthew R. Walter 104 290 0 02 Sep 2015
End-to-End Attention-based Large Vocabulary Speech Recognition Dzmitry Bahdanau J. Chorowski Dmitriy Serdyuk Philemon Brakel Yoshua Bengio 159 1,152 0 18 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation Thang Luong Hieu H. Pham Christopher D. Manning 507 7,984 0 17 Aug 2015
Listen, Attend and Spell William Chan Navdeep Jaitly Quoc V. Le Oriol Vinyals RALM 177 2,273 0 05 Aug 2015
Artificial Neural Networks Applied to Taxi Destination Prediction A. D. Brébisson Étienne Simon Alex Auvolat Pascal Vincent Yoshua Bengio 106 187 0 31 Jul 2015
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos Serena Yeung Olga Russakovsky Ning Jin Mykhaylo Andriluka Greg Mori Li Fei-Fei VLM 113 441 0 21 Jul 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks Kyunghyun Cho Aaron Courville Yoshua Bengio 95 413 0 04 Jul 2015
Attention-Based Models for Speech Recognition J. Chorowski Dzmitry Bahdanau Dmitriy Serdyuk Kyunghyun Cho Yoshua Bengio 151 2,614 0 24 Jun 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing A. Kumar Ozan Irsoy Peter Ondruska Mohit Iyyer James Bradbury Ishaan Gulrajani Victor Zhong Romain Paulus R. Socher 177 1,183 0 24 Jun 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books Yukun Zhu Ryan Kiros R. Zemel Ruslan Salakhutdinov R. Urtasun Antonio Torralba Sanja Fidler 197 2,558 0 22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization Junqi Jin Kun Fu Runpeng Cui Fei Sha Changshui Zhang 93 117 0 20 Jun 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting Xingjian Shi Zhourong Chen Hao Wang Dit-Yan Yeung W. Wong W. Woo 764 8,046 0 13 Jun 2015
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences Hongyuan Mei Joey Tianyi Zhou Matthew R. Walter LM&Ro 116 244 0 12 Jun 2015
Spatial Transformer Networks Max Jaderberg Karen Simonyan Andrew Zisserman Koray Kavukcuoglu 397 7,416 0 05 Jun 2015
The Long-Short Story of Movie Description Anna Rohrbach Marcus Rohrbach Bernt Schiele VLM 77 111 0 04 Jun 2015
What value do explicit high level concepts have in vision to language problems? Qi Wu Chunhua Shen Lingqiao Liu A. Dick Anton Van Den Hengel 89 444 0 03 Jun 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents Jiwei Li Minh-Thang Luong Dan Jurafsky BDL 137 605 0 02 Jun 2015