Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,510 papers shown
Learning Robust Video Synchronization without Annotations
P. Wieschollek
Ido Freeman
Hendrik P. A. Lensch
9
7
0
19 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
35
50
0
17 Oct 2016
Recurrent 3D Attentional Networks for End-to-End Active Object Recognition
Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
3DPC
21
10
0
14 Oct 2016
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong-Ming Zhang
M. Shah
32
18
0
13 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
25
17
0
12 Oct 2016
Attention and Anticipation in Fast Visual-Inertial Navigation
Luca Carlone
S. Karaman
32
77
0
11 Oct 2016
Latent Sequence Decompositions
William Chan
Yu Zhang
Quoc V. Le
Navdeep Jaitly
24
62
0
10 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
230
0
10 Oct 2016
Understanding intermediate layers using linear classifier probes
Guillaume Alain
Yoshua Bengio
FAtt
53
900
0
05 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
33
235
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
37
509
0
03 Oct 2016
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi
Graham Neubig
Ryohei Sasano
Hiroya Takamura
Manabu Okumura
30
242
0
30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu
Zhe Gan
Ricardo Henao
Xin Yuan
Chunyuan Li
Andrew Stevens
Lawrence Carin
BDL
CoGe
45
746
0
28 Sep 2016
Character Sequence Models for ColorfulWords
Kazuya Kawakami
Chris Dyer
Bryan R. Routledge
Noah A. Smith
3DV
28
17
0
28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
22
97
0
26 Sep 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
11
6
0
26 Sep 2016
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Yishu Miao
Phil Blunsom
201
223
0
23 Sep 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
52
14
0
21 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
851
0
21 Sep 2016
Enhanced LSTM for Natural Language Inference
Qian Chen
Xiao-Dan Zhu
Zhenhua Ling
Si Wei
Hui Jiang
Diana Inkpen
LRM
ReLM
41
1,127
0
20 Sep 2016
Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng
Anssi Kanervisto
Jeffrey Ling
Alexander M. Rush
19
226
0
16 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
25
18
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
34
75
0
13 Sep 2016
Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing
J. Chorowski
Michal Zapotoczny
Paweł Rychlikowski
27
5
0
12 Sep 2016
The Role of Context Selection in Object Detection
Ruichi Yu
Xi Chen
Vlad I. Morariu
L. Davis
22
42
0
09 Sep 2016
Optimizing Recurrent Neural Networks Architectures under Time Constraints
Junqi Jin
Ziang Yan
Kun Fu
Nan Jiang
Changshui Zhang
24
2
0
29 Aug 2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples
T. Tanay
Lewis D. Griffin
AAML
25
270
0
27 Aug 2016
Learning to generalize to new compositions in image understanding
Yuval Atzmon
Jonathan Berant
Vahid Kezami
Amir Globerson
Gal Chechik
26
67
0
27 Aug 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
35
69
0
25 Aug 2016
Context Gates for Neural Machine Translation
Zhaopeng Tu
Yang Liu
Zhengdong Lu
Xiaohua Liu
Hang Li
29
137
0
22 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
22
29
0
20 Aug 2016
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Edward Choi
M. T. Bahadori
Joshua A. Kulas
A. Schuetz
Walter F. Stewart
Jimeng Sun
AI4TS
60
1,232
0
19 Aug 2016
Modeling Human Reading with Neural Attention
Michael Hahn
Frank Keller
28
55
0
19 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
24
68
0
18 Aug 2016
Temporal Attention Model for Neural Machine Translation
B. Sankaran
Haitao Mi
Yaser Al-Onaizan
Abe Ittycheriah
17
62
0
09 Aug 2016
End-to-End Localization and Ranking for Relative Attributes
Krishna Kumar Singh
Yong Jae Lee
27
76
0
09 Aug 2016
Learning Online Alignments with Continuous Rewards Policy Gradient
Yuping Luo
Chung-Cheng Chiu
Navdeep Jaitly
Ilya Sutskever
OffRL
18
46
0
03 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
40
145
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
53
1,233
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,888
0
29 Jul 2016
Salient Object Subitizing
Jianming Zhang
Shugao Ma
M. Sameki
Stan Sclaroff
Margrit Betke
Zhe Lin
Xiaohui Shen
Brian L. Price
R. Měch
23
115
0
26 Jul 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon
Y. Aytar
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
DRL
AI4TS
37
167
0
25 Jul 2016
An Actor-Critic Algorithm for Sequence Prediction
Dzmitry Bahdanau
Philemon Brakel
Kelvin Xu
Anirudh Goyal
Ryan J. Lowe
Joelle Pineau
Aaron Courville
Yoshua Bengio
57
636
0
24 Jul 2016
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition
Jun Liu
Amir Shahroudy
Dong Xu
Gang Wang
49
1,099
0
24 Jul 2016
Hierarchical Attention Network for Action Recognition in Videos
Yilin Wang
Suhang Wang
Jiliang Tang
Neil O'Hare
Yi-Ju Chang
Baoxin Li
BDL
30
82
0
21 Jul 2016
Constructing a Natural Language Inference Dataset using Generative Neural Networks
Janez Starc
Dunja Mladenić
27
7
0
20 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
41
413
0
20 Jul 2016
HeMIS: Hetero-Modal Image Segmentation
Mohammad Havaei
N. Guizard
Nicolas Chapados
Yoshua Bengio
MedIm
26
261
0
18 Jul 2016
Weakly Supervised Learning of Heterogeneous Concepts in Videos
Sohil Shah
K. Kulkarni
Arijit Biswas
Ankit Gandhi
Om Deshmukh
L. Davis
32
2
0
12 Jul 2016
VideoLSTM Convolves, Attends and Flows for Action Recognition
Zhenyang Li
E. Gavves
Mihir Jain
Cees G. M. Snoek
50
463
0
06 Jul 2016