1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,509 papers shown
Title
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
27
494
0
12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews
Zachary Chase Lipton
Sharad Vikram
Julian McAuley
BDL
33
32
0
11 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
44
871
0
11 Nov 2015
Attention to Scale: Scale-aware Semantic Image Segmentation
Liang-Chieh Chen
Yi Yang
Jiang Wang
Wei Xu
Alan Yuille
SSeg
54
1,316
0
10 Nov 2015
Detecting events and key actors in multi-person videos
Vignesh Ramanathan
Jonathan Huang
Sami Abu-El-Haija
Alexander N. Gorban
Kevin Patrick Murphy
Li Fei-Fei
24
208
0
09 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
31
1,062
0
09 Nov 2015
Generating Images from Captions with Attention
Elman Mansimov
Emilio Parisotto
Jimmy Lei Ba
Ruslan Salakhutdinov
VLM
43
449
0
09 Nov 2015
Explicit Knowledge-based Reasoning for Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
A. Dick
39
257
0
09 Nov 2015
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Felix Hill
Antoine Bordes
S. Chopra
Jason Weston
RALM
33
633
0
07 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
54
1,314
0
07 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
59
1,867
0
07 Nov 2015
Deep Kernel Learning
A. Wilson
Zhiting Hu
Ruslan Salakhutdinov
Eric Xing
BDL
61
872
0
06 Nov 2015
RATM: Recurrent Attentive Tracking Model
Samira Ebrahimi Kahou
Vincent Michalski
Roland Memisevic
40
84
0
29 Oct 2015
On End-to-End Program Generation from User Intention by Deep Neural Networks
Lili Mou
Rui Men
Ge Li
Lu Zhang
Zhi Jin
26
46
0
25 Oct 2015
Generic decoding of seen and imagined objects using hierarchical visual features
T. Horikawa
Y. Kamitani
17
443
0
22 Oct 2015
Multilingual Image Description with Neural Sequence Models
Desmond Elliott
Stella Frank
Eva Hasler
VLM
22
75
0
15 Oct 2015
A Diversity-Promoting Objective Function for Neural Conversation Models
Jiwei Li
Michel Galley
Chris Brockett
Jianfeng Gao
W. Dolan
56
2,365
0
11 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments
A. Mathews
Lexing Xie
Xuming He
26
221
0
06 Oct 2015
Learning Wake-Sleep Recurrent Attention Models
Jimmy Ba
Roger C. Grosse
Ruslan Salakhutdinov
B. Frey
BDL
32
65
0
22 Sep 2015
Reasoning about Entailment with Neural Attention
Tim Rocktäschel
Edward Grefenstette
Karl Moritz Hermann
Tomáš Kočiský
Phil Blunsom
NAI
12
760
0
22 Sep 2015
Recurrent Spatial Transformer Networks
Søren Kaae Sønderby
C. Sønderby
Lars Maaløe
Ole Winther
ViT
22
48
0
17 Sep 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
22
101
0
16 Sep 2015
What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment
Hongyuan Mei
Mohit Bansal
Matthew R. Walter
31
288
0
02 Sep 2015
End-to-End Attention-based Large Vocabulary Speech Recognition
Dzmitry Bahdanau
J. Chorowski
Dmitriy Serdyuk
Philemon Brakel
Yoshua Bengio
17
1,146
0
18 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
74
2,251
0
05 Aug 2015
Artificial Neural Networks Applied to Taxi Destination Prediction
A. D. Brébisson
Étienne Simon
Alex Auvolat
Pascal Vincent
Yoshua Bengio
19
185
0
31 Jul 2015
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Serena Yeung
Olga Russakovsky
Ning Jin
Mykhaylo Andriluka
Greg Mori
Li Fei-Fei
VLM
42
436
0
21 Jul 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
32
411
0
04 Jul 2015
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
43
2,598
0
24 Jun 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
A. Kumar
Ozan Irsoy
Peter Ondruska
Mohit Iyyer
James Bradbury
Ishaan Gulrajani
Victor Zhong
Romain Paulus
R. Socher
54
1,175
0
24 Jun 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
57
2,518
0
22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
34
117
0
20 Jun 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
236
7,906
0
13 Jun 2015
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
Hongyuan Mei
Mohit Bansal
Matthew R. Walter
LM&Ro
29
242
0
12 Jun 2015
Spatial Transformer Networks
Max Jaderberg
Karen Simonyan
Andrew Zisserman
Koray Kavukcuoglu
170
7,337
0
05 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
33
110
0
04 Jun 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
27
443
0
03 Jun 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents
Jiwei Li
Minh-Thang Luong
Dan Jurafsky
BDL
21
602
0
02 Jun 2015
Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions
Jimmy Ba
Kevin Swersky
Sanja Fidler
Ruslan Salakhutdinov
VLM
29
435
0
01 Jun 2015
Learning with hidden variables
Y. Roudi
Graham Taylor
42
16
0
01 Jun 2015
Learning to Answer Questions From Image Using Convolutional Neural Network
Lin Ma
Zhengdong Lu
Hang Li
27
261
0
01 Jun 2015
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
Haoyuan Gao
Junhua Mao
Jie Zhou
Zhiheng Huang
Lei Wang
Wenyuan Xu
32
496
0
21 May 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
J. Hockenmaier
Svetlana Lazebnik
83
2,006
0
19 May 2015
Visual Semantic Role Labeling
Saurabh Gupta
Jitendra Malik
29
403
0
17 May 2015
Exploring Nearest Neighbor Approaches for Image Captioning
Jacob Devlin
Saurabh Gupta
Ross B. Girshick
Margaret Mitchell
C. L. Zitnick
27
195
0
17 May 2015
Exploring Models and Data for Image Question Answering
Mengye Ren
Ryan Kiros
R. Zemel
32
711
0
08 May 2015
Jointly Modeling Embedding and Translation to Bridge Video and Language
Yingwei Pan
Tao Mei
Ting Yao
Houqiang Li
Y. Rui
41
535
0
07 May 2015
Interleaved Text/Image Deep Mining on a Large-Scale Radiology Database for Automated Image Interpretation
Hoo-Chang Shin
Le Lu
Lauren Kim
Ari Seff
Jianhua Yao
Ronald M. Summers
34
46
0
04 May 2015
Reinforcement Learning Neural Turing Machines - Revised
Wojciech Zaremba
Ilya Sutskever
21
165
0
04 May 2015