1502.03044
Cited By
v3 (latest)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
116
352
0
16 Nov 2015
Sherlock: Scalable Fact Learning in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
59
26
0
16 Nov 2015
Neural Programmer: Inducing Latent Programs with Gradient Descent
Arvind Neelakantan
Quoc V. Le
Ilya Sutskever
ODL
129
262
0
16 Nov 2015
Uncovering Temporal Context for Video Question and Answering
Linchao Zhu
Zhongwen Xu
Yi Yang
Alexander G. Hauptmann
BDL
90
45
0
15 Nov 2015
Oracle performance for visual captioning
L. Yao
Nicolas Ballas
Kyunghyun Cho
John R. Smith
Yoshua Bengio
VLM
111
8
0
14 Nov 2015
Reversible Recursive Instance-level Object Segmentation
Xiaodan Liang
Yunchao Wei
Xiaohui Shen
Zequn Jie
Jiashi Feng
Liang Lin
Shuicheng Yan
SSeg
ISeg
67
59
0
14 Nov 2015
Semantic Object Parsing with Local-Global Long Short-Term Memory
Xiaodan Liang
Xiaohui Shen
Donglai Xiang
Jiashi Feng
Liang Lin
Shuicheng Yan
85
185
0
14 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
147
555
0
13 Nov 2015
Action Recognition using Visual Attention
Shikhar Sharma
Ryan Kiros
Ruslan Salakhutdinov
102
667
0
12 Nov 2015
Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising
Raviteja Vemulapalli
Oncel Tuzel
Ming-Yuan Liu
80
70
0
12 Nov 2015
Hand-Object Interaction and Precise Localization in Transitive Action Recognition
Amir Rosenfeld
S. Ullman
68
8
0
12 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
90
497
0
12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews
Zachary Chase Lipton
Sharad Vikram
Julian McAuley
BDL
115
33
0
11 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
166
891
0
11 Nov 2015
Attention to Scale: Scale-aware Semantic Image Segmentation
Liang-Chieh Chen
Yi Yang
Jiang Wang
Wei Xu
Alan Yuille
SSeg
162
1,322
0
10 Nov 2015
Detecting events and key actors in multi-person videos
Vignesh Ramanathan
Jonathan Huang
Sami Abu-El-Haija
Alexander N. Gorban
Kevin Patrick Murphy
Li Fei-Fei
98
209
0
09 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
177
1,079
0
09 Nov 2015
Generating Images from Captions with Attention
Elman Mansimov
Emilio Parisotto
Jimmy Lei Ba
Ruslan Salakhutdinov
VLM
102
457
0
09 Nov 2015
Explicit Knowledge-based Reasoning for Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
A. Dick
91
261
0
09 Nov 2015
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Felix Hill
Antoine Bordes
S. Chopra
Jason Weston
RALM
195
638
0
07 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
144
1,362
0
07 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
184
1,889
0
07 Nov 2015
Deep Kernel Learning
A. Wilson
Zhiting Hu
Ruslan Salakhutdinov
Eric Xing
BDL
303
893
0
06 Nov 2015
RATM: Recurrent Attentive Tracking Model
Samira Ebrahimi Kahou
Vincent Michalski
Roland Memisevic
87
84
0
29 Oct 2015
On End-to-End Program Generation from User Intention by Deep Neural Networks
Lili Mou
Rui Men
Ge Li
Lu Zhang
Zhi Jin
71
46
0
25 Oct 2015
Generic decoding of seen and imagined objects using hierarchical visual features
T. Horikawa
Y. Kamitani
50
454
0
22 Oct 2015
Multilingual Image Description with Neural Sequence Models
Desmond Elliott
Stella Frank
Eva Hasler
VLM
145
76
0
15 Oct 2015
A Diversity-Promoting Objective Function for Neural Conversation Models
Jiwei Li
Michel Galley
Chris Brockett
Jianfeng Gao
W. Dolan
166
2,407
0
11 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments
A. Mathews
Lexing Xie
Xuming He
110
222
0
06 Oct 2015
Learning Wake-Sleep Recurrent Attention Models
Jimmy Ba
Roger C. Grosse
Ruslan Salakhutdinov
B. Frey
BDL
97
65
0
22 Sep 2015
Reasoning about Entailment with Neural Attention
Tim Rocktäschel
Edward Grefenstette
Karl Moritz Hermann
Tomáš Kočiský
Phil Blunsom
NAI
95
764
0
22 Sep 2015
Recurrent Spatial Transformer Networks
Søren Kaae Sønderby
C. Sønderby
Lars Maaløe
Ole Winther
ViT
67
48
0
17 Sep 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
70
101
0
16 Sep 2015
What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment
Hongyuan Mei
Joey Tianyi Zhou
Matthew R. Walter
104
290
0
02 Sep 2015
End-to-End Attention-based Large Vocabulary Speech Recognition
Dzmitry Bahdanau
J. Chorowski
Dmitriy Serdyuk
Philemon Brakel
Yoshua Bengio
159
1,152
0
18 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
507
7,984
0
17 Aug 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
177
2,273
0
05 Aug 2015
Artificial Neural Networks Applied to Taxi Destination Prediction
A. D. Brébisson
Étienne Simon
Alex Auvolat
Pascal Vincent
Yoshua Bengio
106
187
0
31 Jul 2015
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Serena Yeung
Olga Russakovsky
Ning Jin
Mykhaylo Andriluka
Greg Mori
Li Fei-Fei
VLM
113
441
0
21 Jul 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
95
413
0
04 Jul 2015
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
151
2,614
0
24 Jun 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
A. Kumar
Ozan Irsoy
Peter Ondruska
Mohit Iyyer
James Bradbury
Ishaan Gulrajani
Victor Zhong
Romain Paulus
R. Socher
177
1,183
0
24 Jun 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
197
2,558
0
22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
93
117
0
20 Jun 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
764
8,046
0
13 Jun 2015
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
Hongyuan Mei
Joey Tianyi Zhou
Matthew R. Walter
LM&Ro
116
244
0
12 Jun 2015
Spatial Transformer Networks
Max Jaderberg
Karen Simonyan
Andrew Zisserman
Koray Kavukcuoglu
397
7,416
0
05 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
77
111
0
04 Jun 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
89
444
0
03 Jun 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents
Jiwei Li
Minh-Thang Luong
Dan Jurafsky
BDL
137
605
0
02 Jun 2015