1502.03044
Cited By
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages
Hans Krupakar
R. S. Milton
91
16
0
07 Dec 2016
Spatially Adaptive Computation Time for Residual Networks
Michael Figurnov
Maxwell D. Collins
Yukun Zhu
Li Zhang
Jonathan Huang
Dmitry Vetrov
Ruslan Salakhutdinov
75
351
0
07 Dec 2016
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
146
1,458
0
06 Dec 2016
Condensed Memory Networks for Clinical Diagnostic Inferencing
Aaditya (Adi) Prakash
Siyuan Zhao
Sadid A. Hasan
Vivek Datla
Kathy Lee
Ashequl Qadir
Joey Liu
Oladimeji Farri
69
103
0
06 Dec 2016
Learning to Detect Multiple Photographic Defects
Ning Yu
Xiaohui Shen
Zhe Lin
R. Měch
Connelly Barnes
76
14
0
06 Dec 2016
ImageNet pre-trained models with batch normalization
Marcel Simon
E. Rodner
Joachim Denzler
VLM
SSeg
104
166
0
05 Dec 2016
Areas of Attention for Image Captioning
M. Pedersoli
Thomas Lucas
Cordelia Schmid
Jakob Verbeek
117
206
0
03 Dec 2016
Parameter Compression of Recurrent Neural Networks and Degradation of Short-term Memory
Jonathan A. Cox
22
5
0
02 Dec 2016
Guided Open Vocabulary Image Captioning with Constrained Beam Search
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
97
238
0
02 Dec 2016
Self-critical Sequence Training for Image Captioning
Steven J. Rennie
E. Marcheret
Youssef Mroueh
Jerret Ross
Vaibhava Goel
149
1,898
0
02 Dec 2016
Temporal Attention-Gated Model for Robust Sequence Classification
Wenjie Pei
T. Baltrušaitis
David Tax
Louis-Philippe Morency
82
89
0
01 Dec 2016
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
181
446
0
01 Dec 2016
Video Captioning with Multi-Faceted Attention
Xiang Long
Chuang Gan
Gerard de Melo
87
88
0
01 Dec 2016
Sync-DRAW: Automatic Video Generation using Deep Recurrent Attentive Architectures
Gaurav Mittal
Tanya Marwah
V. Balasubramanian
VGen
DiffM
101
67
0
30 Nov 2016
Modeling Relationships in Referential Expressions with Compositional Modular Networks
Ronghang Hu
Marcus Rohrbach
Jacob Andreas
Trevor Darrell
Kate Saenko
84
407
0
30 Nov 2016
Attend in groups: a weakly-supervised deep learning framework for learning from web data
Bohan Zhuang
Lingqiao Liu
Yao Li
Chunhua Shen
Ian Reid
NoLa
69
89
0
30 Nov 2016
Context-aware Natural Language Generation with Recurrent Neural Networks
Jian Tang
Yifan Yang
Samuel Carton
Ming Zhang
Qiaozhu Mei
82
67
0
29 Nov 2016
Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model
Marcella Cornia
Lorenzo Baraldi
G. Serra
Rita Cucchiara
130
552
0
29 Nov 2016
Deep Quantization: Encoding Convolutional Activations with Deep Generative Model
Zhaofan Qiu
Ting Yao
Tao Mei
DRL
MQ
83
60
0
29 Nov 2016
Emergence of foveal image sampling from learning to attend in visual scenes
Brian Cheung
E. Weiss
Bruno A. Olshausen
89
39
0
28 Nov 2016
Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi
C. Grana
Rita Cucchiara
82
192
0
28 Nov 2016
Attention-based Memory Selection Recurrent Network for Language Modeling
Da-Rong Liu
Shun-Po Chuang
Hung-yi Lee
RALM
KELM
45
5
0
26 Nov 2016
Neural Machine Translation with Latent Semantic of Image and Text
Joji Toyama
Masanori Misono
Masahiro Suzuki
Kotaro Nakayama
Y. Matsuo
134
14
0
25 Nov 2016
An Overview on Data Representation Learning: From Traditional Feature Learning to Recent Deep Learning
G. Zhong
Lina Wang
Junyu Dong
AI4TS
83
183
0
25 Nov 2016
Semantic Compositional Networks for Visual Captioning
Zhe Gan
Chuang Gan
Xiaodong He
Yunchen Pu
Kenneth Tran
Jianfeng Gao
Lawrence Carin
Li Deng
CoGe
114
427
0
23 Nov 2016
GuessWhat?! Visual object discovery through multi-modal dialogue
H. D. Vries
Florian Strub
A. Chandar
Olivier Pietquin
Hugo Larochelle
Aaron Courville
VLM
112
428
0
23 Nov 2016
Adaptive Feature Abstraction for Translating Video to Text
Yunchen Pu
Martin Renqiang Min
Zhe Gan
Lawrence Carin
72
14
0
23 Nov 2016
Recurrent Attention Models for Depth-Based Person Identification
Albert Haque
Alexandre Alahi
Li Fei-Fei
3DH
87
142
0
22 Nov 2016
GRAM: Graph-based Attention Model for Healthcare Representation Learning
Edward Choi
M. T. Bahadori
Le Song
Walter F. Stewart
Jimeng Sun
GNN
97
678
0
21 Nov 2016
Coherent Dialogue with Attention-based Language Models
Hongyuan Mei
Joey Tianyi Zhou
Matthew R. Walter
AuLLM
80
83
0
21 Nov 2016
Dense Captioning with Joint Inference and Visual Context
L. Yang
K. Tang
Jianchao Yang
Li Li
VLM
103
170
0
21 Nov 2016
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
Bryan A. Plummer
Arun Mallya
Christopher M. Cervantes
Julia Hockenmaier
Svetlana Lazebnik
142
189
0
21 Nov 2016
A Hierarchical Approach for Generating Descriptive Image Paragraphs
J. Krause
Justin Johnson
Ranjay Krishna
Li Fei-Fei
VLM
106
379
0
20 Nov 2016
Recurrent Memory Addressing for describing videos
A. Jain
Abhinav Agarwalla
Kumar Krishna Agrawal
Pabitra Mitra
62
10
0
20 Nov 2016
An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data
Sijie Song
Cuiling Lan
Junliang Xing
Wenjun Zeng
Jiaying Liu
202
991
0
18 Nov 2016
Cross Domain Knowledge Transfer for Person Re-identification
Qiqi Xiao
Kelei Cao
Haonan Chen
Fangyue Peng
Fangqiu Yi
91
18
0
18 Nov 2016
AutoScaler: Scale-Attention Networks for Visual Correspondence
Shenlong Wang
Linjie Luo
Ning Zhang
Jia Li
66
19
0
17 Nov 2016
SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning
Long Chen
Hanwang Zhang
Jun Xiao
Liqiang Nie
Jian Shao
Wei Liu
Tat-Seng Chua
115
1,667
0
17 Nov 2016
Instance-aware Image and Sentence Matching with Selective Multimodal LSTM
Yan Huang
Wei Wang
Liang Wang
114
223
0
17 Nov 2016
DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows
Jason Kuen
Xiangfei Kong
G. Wang
Yap-Peng Tan
70
14
0
17 Nov 2016
Semantic Regularisation for Recurrent Image Annotation
Feng Liu
Tao Xiang
Timothy M. Hospedales
Wankou Yang
Changyin Sun
107
104
0
16 Nov 2016
A Semi-supervised Framework for Image Captioning
Wenhu Chen
Aurelien Lucchi
Thomas Hofmann
92
9
0
16 Nov 2016
The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives
Mohit Iyyer
Varun Manjunatha
Anupam Guha
Yogarshi Vyas
Jordan L. Boyd-Graber
Hal Daumé
L. Davis
87
101
0
16 Nov 2016
Diversity encouraged learning of unsupervised LSTM ensemble for neural activity video prediction
Yilin Song
J. Viventi
Yao Wang
AI4TS
54
2
0
15 Nov 2016
Hierarchical Object Detection with Deep Reinforcement Learning
Míriam Bellver
Xavier Giró-i-Nieto
F. Marqués
Jordi Torres
80
105
0
11 Nov 2016
Getting Started with Neural Models for Semantic Matching in Web Search
Kezban Dilek Onal
I. S. Altingövde
Pinar Senkul
Maarten de Rijke
VLM
3DV
63
9
0
08 Nov 2016
Memory-augmented Attention Modelling for Videos
Rasool Fakoor
Abdel-rahman Mohamed
Margaret Mitchell
S. B. Kang
Pushmeet Kohli
115
20
0
07 Nov 2016
Latent Attention For If-Then Program Synthesis
Xinyun Chen
Chang-rui Liu
E. C. Shin
Basel Alomair
Mingcheng Chen
83
70
0
07 Nov 2016
Hierarchical Question Answering for Long Documents
Eunsol Choi
D. Hewlett
Alexandre Lacoste
Illia Polosukhin
Jakob Uszkoreit
Jonathan Berant
RALM
99
168
0
06 Nov 2016
Boosting Image Captioning with Attributes
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
VLM
132
624
0
05 Nov 2016