Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
384
5,402
0
03 Nov 2016
The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables
Chris J. Maddison
A. Mnih
Yee Whye Teh
BDL
230
2,544
0
02 Nov 2016
Dual Attention Networks for Multimodal Reasoning and Matching
Hyeonseob Nam
Jung-Woo Ha
Jeonghee Kim
134
670
0
02 Nov 2016
Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
Daniel Neil
Michael Pfeiffer
Shih-Chii Liu
AI4TS
76
448
0
29 Oct 2016
Professor Forcing: A New Algorithm for Training Recurrent Networks
Alex Lamb
Anirudh Goyal
Ying Zhang
Saizheng Zhang
Aaron Courville
Yoshua Bengio
GAN
145
598
0
27 Oct 2016
Cross-Modal Scene Networks
Y. Aytar
Lluis Castrejon
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
89
114
0
27 Oct 2016
Can Active Memory Replace Attention?
Lukasz Kaiser
Samy Bengio
91
59
0
27 Oct 2016
Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models
Shubham Toshniwal
Karen Livescu
57
41
0
20 Oct 2016
Lexicon Integrated CNN Models with Attention for Sentiment Analysis
Bonggun Shin
Timothy Lee
Jinho Choi
57
113
0
20 Oct 2016
Using Fast Weights to Attend to the Recent Past
Jimmy Ba
Geoffrey E. Hinton
Volodymyr Mnih
Joel Z Leibo
Catalin Ionescu
107
273
0
20 Oct 2016
Learning Robust Video Synchronization without Annotations
P. Wieschollek
Ido Freeman
Hendrik P. A. Lensch
32
7
0
19 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
122
50
0
17 Oct 2016
Recurrent 3D Attentional Networks for End-to-End Active Object Recognition
Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
3DPC
71
10
0
14 Oct 2016
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong Zhang
M. Shah
75
18
0
13 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
50
17
0
12 Oct 2016
Attention and Anticipation in Fast Visual-Inertial Navigation
Luca Carlone
S. Karaman
91
78
0
11 Oct 2016
Latent Sequence Decompositions
William Chan
Yu Zhang
Quoc V. Le
Navdeep Jaitly
115
62
0
10 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
157
231
0
10 Oct 2016
Understanding intermediate layers using linear classifier probes
Guillaume Alain
Yoshua Bengio
FAtt
198
959
0
05 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
119
244
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS, 3DV
357
519
0
03 Oct 2016
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi
Graham Neubig
Ryohei Sasano
Hiroya Takamura
Manabu Okumura
84
244
0
30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu
Zhe Gan
Ricardo Henao
Xin Yuan
Chunyuan Li
Andrew Stevens
Lawrence Carin
BDL, CoGe
109
756
0
28 Sep 2016
Character Sequence Models for ColorfulWords
Kazuya Kawakami
Chris Dyer
Bryan R. Routledge
Noah A. Smith
3DV
59
18
0
28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
81
98
0
26 Sep 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
73
6
0
26 Sep 2016
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Yishu Miao
Phil Blunsom
368
223
0
23 Sep 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
83
15
0
21 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
138
856
0
21 Sep 2016
Enhanced LSTM for Natural Language Inference
Qian Chen
Xiao-Dan Zhu
Zhenhua Ling
Si Wei
Hui Jiang
Diana Inkpen
LRM, ReLM
185
1,133
0
20 Sep 2016
Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng
Anssi Kanervisto
Jeffrey Ling
Alexander M. Rush
66
230
0
16 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
78
20
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
84
76
0
13 Sep 2016
Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing
J. Chorowski
Michal Zapotoczny
Paweł Rychlikowski
42
5
0
12 Sep 2016
The Role of Context Selection in Object Detection
Ruichi Yu
Xi Chen
Vlad I. Morariu
L. Davis
60
42
0
09 Sep 2016
Optimizing Recurrent Neural Networks Architectures under Time Constraints
Junqi Jin
Ziang Yan
Kun Fu
Nan Jiang
Changshui Zhang
75
2
0
29 Aug 2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples
T. Tanay
Lewis D. Griffin
AAML
105
272
0
27 Aug 2016
Learning to generalize to new compositions in image understanding
Yuval Atzmon
Jonathan Berant
Vahid Kazemi
Amir Globerson
Gal Chechik
82
67
0
27 Aug 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
79
69
0
25 Aug 2016
Context Gates for Neural Machine Translation
Zhaopeng Tu
Yang Liu
Zhengdong Lu
Xiaohua Liu
Hang Li
75
138
0
22 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
114
29
0
20 Aug 2016
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Edward Choi
M. T. Bahadori
Joshua A. Kulas
A. Schuetz
Walter F. Stewart
Jimeng Sun
AI4TS
140
1,254
0
19 Aug 2016
Modeling Human Reading with Neural Attention
Michael Hahn
Frank Keller
90
56
0
19 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
86
68
0
18 Aug 2016
Temporal Attention Model for Neural Machine Translation
B. Sankaran
Haitao Mi
Yaser Al-Onaizan
Abe Ittycheriah
74
62
0
09 Aug 2016
End-to-End Localization and Ranking for Relative Attributes
Krishna Kumar Singh
Yong Jae Lee
144
76
0
09 Aug 2016
Learning Online Alignments with Continuous Rewards Policy Gradient
Yuping Luo
Chung-Cheng Chiu
Navdeep Jaitly
Ilya Sutskever
OffRL
78
46
0
03 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
79
161
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
135
1,281
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
162
1,931
0
29 Jul 2016