Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Yin and Yang: Balancing and Answering Binary Visual Questions
Peng Zhang
Yash Goyal
D. Summers-Stay
Dhruv Batra
Devi Parikh
CoGe
116
352
0
16 Nov 2015
Sherlock: Scalable Fact Learning in Images
Mohamed Elhoseiny
Scott D. Cohen
W. Chang
Brian L. Price
Ahmed Elgammal
59
26
0
16 Nov 2015
Neural Programmer: Inducing Latent Programs with Gradient Descent
Arvind Neelakantan
Quoc V. Le
Ilya Sutskever
ODL
129
262
0
16 Nov 2015
Uncovering Temporal Context for Video Question and Answering
Linchao Zhu
Zhongwen Xu
Yi Yang
Alexander G. Hauptmann
BDL
90
45
0
15 Nov 2015
Oracle performance for visual captioning
L. Yao
Nicolas Ballas
Kyunghyun Cho
John R. Smith
Yoshua Bengio
VLM
111
8
0
14 Nov 2015
Reversible Recursive Instance-level Object Segmentation
Xiaodan Liang
Yunchao Wei
Xiaohui Shen
Zequn Jie
Jiashi Feng
Liang Lin
Shuicheng Yan
SSeg
ISeg
67
59
0
14 Nov 2015
Semantic Object Parsing with Local-Global Long Short-Term Memory
Xiaodan Liang
Xiaohui Shen
Donglai Xiang
Jiashi Feng
Liang Lin
Shuicheng Yan
85
185
0
14 Nov 2015
Natural Language Object Retrieval
Ronghang Hu
Huazhe Xu
Marcus Rohrbach
Jiashi Feng
Kate Saenko
Trevor Darrell
ObjD
147
555
0
13 Nov 2015
Action Recognition using Visual Attention
Shikhar Sharma
Ryan Kiros
Ruslan Salakhutdinov
102
667
0
12 Nov 2015
Deep Gaussian Conditional Random Field Network: A Model-based Deep Network for Discriminative Denoising
Raviteja Vemulapalli
Oncel Tuzel
Ming-Yu Liu
80
70
0
12 Nov 2015
Hand-Object Interaction and Precise Localization in Transitive Action Recognition
Amir Rosenfeld
S. Ullman
68
8
0
12 Nov 2015
Grounding of Textual Phrases in Images by Reconstruction
Anna Rohrbach
Marcus Rohrbach
Ronghang Hu
Trevor Darrell
Bernt Schiele
90
497
0
12 Nov 2015
Generative Concatenative Nets Jointly Learn to Write and Classify Reviews
Zachary Chase Lipton
Sharad Vikram
Julian McAuley
BDL
115
33
0
11 Nov 2015
Visual7W: Grounded Question Answering in Images
Yuke Zhu
Oliver Groth
Michael S. Bernstein
Li Fei-Fei
166
891
0
11 Nov 2015
Attention to Scale: Scale-aware Semantic Image Segmentation
Liang-Chieh Chen
Yi Yang
Jiang Wang
Wei Xu
Alan Yuille
SSeg
162
1,322
0
10 Nov 2015
Detecting events and key actors in multi-person videos
Vignesh Ramanathan
Jonathan Huang
Sami Abu-El-Haija
Alexander N. Gorban
Kevin Patrick Murphy
Li Fei-Fei
98
209
0
09 Nov 2015
Neural Module Networks
Jacob Andreas
Marcus Rohrbach
Trevor Darrell
Dan Klein
CoGe
177
1,079
0
09 Nov 2015
Generating Images from Captions with Attention
Elman Mansimov
Emilio Parisotto
Jimmy Lei Ba
Ruslan Salakhutdinov
VLM
102
457
0
09 Nov 2015
Explicit Knowledge-based Reasoning for Visual Question Answering
Peng Wang
Qi Wu
Chunhua Shen
Anton Van Den Hengel
A. Dick
91
261
0
09 Nov 2015
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Felix Hill
Antoine Bordes
S. Chopra
Jason Weston
RALM
195
638
0
07 Nov 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
144
1,362
0
07 Nov 2015
Stacked Attention Networks for Image Question Answering
Zichao Yang
Xiaodong He
Jianfeng Gao
Li Deng
Alex Smola
BDL
184
1,889
0
07 Nov 2015
Deep Kernel Learning
A. Wilson
Zhiting Hu
Ruslan Salakhutdinov
Eric Xing
BDL
303
893
0
06 Nov 2015
RATM: Recurrent Attentive Tracking Model
Samira Ebrahimi Kahou
Vincent Michalski
Roland Memisevic
87
84
0
29 Oct 2015
On End-to-End Program Generation from User Intention by Deep Neural Networks
Lili Mou
Rui Men
Ge Li
Lu Zhang
Zhi Jin
71
46
0
25 Oct 2015
Generic decoding of seen and imagined objects using hierarchical visual features
T. Horikawa
Y. Kamitani
50
454
0
22 Oct 2015
Multilingual Image Description with Neural Sequence Models
Desmond Elliott
Stella Frank
Eva Hasler
VLM
145
76
0
15 Oct 2015
A Diversity-Promoting Objective Function for Neural Conversation Models
Jiwei Li
Michel Galley
Chris Brockett
Jianfeng Gao
W. Dolan
166
2,407
0
11 Oct 2015
SentiCap: Generating Image Descriptions with Sentiments
A. Mathews
Lexing Xie
Xuming He
110
222
0
06 Oct 2015
Learning Wake-Sleep Recurrent Attention Models
Jimmy Ba
Roger C. Grosse
Ruslan Salakhutdinov
B. Frey
BDL
97
65
0
22 Sep 2015
Reasoning about Entailment with Neural Attention
Tim Rocktäschel
Edward Grefenstette
Karl Moritz Hermann
Tomáš Kočiský
Phil Blunsom
NAI
95
764
0
22 Sep 2015
Recurrent Spatial Transformer Networks
Søren Kaae Sønderby
C. Sønderby
Lars Maaløe
Ole Winther
ViT
67
48
0
17 Sep 2015
Guiding Long-Short Term Memory for Image Caption Generation
Xu Jia
E. Gavves
Basura Fernando
Tinne Tuytelaars
VLM
70
101
0
16 Sep 2015
What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment
Hongyuan Mei
Mohit Bansal
Matthew R. Walter
104
290
0
02 Sep 2015
End-to-End Attention-based Large Vocabulary Speech Recognition
Dzmitry Bahdanau
J. Chorowski
Dmitriy Serdyuk
Philemon Brakel
Yoshua Bengio
159
1,152
0
18 Aug 2015
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
507
7,984
0
17 Aug 2015
Listen, Attend and Spell
William Chan
Navdeep Jaitly
Quoc V. Le
Oriol Vinyals
RALM
177
2,273
0
05 Aug 2015
Artificial Neural Networks Applied to Taxi Destination Prediction
A. D. Brébisson
Étienne Simon
Alex Auvolat
Pascal Vincent
Yoshua Bengio
106
187
0
31 Jul 2015
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Serena Yeung
Olga Russakovsky
Ning Jin
Mykhaylo Andriluka
Greg Mori
Li Fei-Fei
VLM
113
441
0
21 Jul 2015
Describing Multimedia Content using Attention-based Encoder-Decoder Networks
Kyunghyun Cho
Aaron Courville
Yoshua Bengio
95
413
0
04 Jul 2015
Attention-Based Models for Speech Recognition
J. Chorowski
Dzmitry Bahdanau
Dmitriy Serdyuk
Kyunghyun Cho
Yoshua Bengio
151
2,614
0
24 Jun 2015
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
A. Kumar
Ozan Irsoy
Peter Ondruska
Mohit Iyyer
James Bradbury
Ishaan Gulrajani
Victor Zhong
Romain Paulus
R. Socher
177
1,183
0
24 Jun 2015
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Yukun Zhu
Ryan Kiros
R. Zemel
Ruslan Salakhutdinov
R. Urtasun
Antonio Torralba
Sanja Fidler
197
2,558
0
22 Jun 2015
Aligning where to see and what to tell: image caption with region-based attention and scene factorization
Junqi Jin
Kun Fu
Runpeng Cui
Fei Sha
Changshui Zhang
93
117
0
20 Jun 2015
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
764
8,046
0
13 Jun 2015
Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences
Hongyuan Mei
Mohit Bansal
Matthew R. Walter
LM&Ro
116
244
0
12 Jun 2015
Spatial Transformer Networks
Max Jaderberg
Karen Simonyan
Andrew Zisserman
Koray Kavukcuoglu
397
7,416
0
05 Jun 2015
The Long-Short Story of Movie Description
Anna Rohrbach
Marcus Rohrbach
Bernt Schiele
VLM
77
111
0
04 Jun 2015
What value do explicit high level concepts have in vision to language problems?
Qi Wu
Chunhua Shen
Lingqiao Liu
A. Dick
Anton Van Den Hengel
89
444
0
03 Jun 2015
A Hierarchical Neural Autoencoder for Paragraphs and Documents
Jiwei Li
Minh-Thang Luong
Dan Jurafsky
BDL
137
605
0
02 Jun 2015