arXiv:1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,510 papers shown
Title
Learning Robust Video Synchronization without Annotations
P. Wieschollek
Ido Freeman
Hendrik P. A. Lensch
9
7
0
19 Oct 2016
Spatio-Temporal Attention Models for Grounded Video Captioning
M. Zanfir
Elisabeta Marinoiu
C. Sminchisescu
35
50
0
17 Oct 2016
Recurrent 3D Attentional Networks for End-to-End Active Object Recognition
Min Liu
Yifei Shi
Lintao Zheng
Kai Xu
Hui Huang
Dinesh Manocha
3DPC
21
10
0
14 Oct 2016
Video Fill in the Blank with Merging LSTMs
Amir Mazaheri
Dong-Ming Zhang
M. Shah
32
18
0
13 Oct 2016
Generating captions without looking beyond objects
Hendrik Heuer
Christof Monz
A. Smeulders
25
17
0
12 Oct 2016
Attention and Anticipation in Fast Visual-Inertial Navigation
Luca Carlone
S. Karaman
32
77
0
11 Oct 2016
Latent Sequence Decompositions
William Chan
Yu Zhang
Quoc V. Le
Navdeep Jaitly
24
62
0
10 Oct 2016
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering
Youngjae Yu
Hyungjin Ko
Jongwook Choi
Gunhee Kim
14
230
0
10 Oct 2016
Understanding intermediate layers using linear classifier probes
Guillaume Alain
Yoshua Bengio
FAtt
53
900
0
05 Oct 2016
Visual Question Answering: Datasets, Algorithms, and Future Challenges
Kushal Kafle
Christopher Kanan
OOD
33
235
0
05 Oct 2016
A Survey of Multi-View Representation Learning
Yingming Li
Ming Yang
Zhongfei Zhang
AI4TS
3DV
37
509
0
03 Oct 2016
Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi
Graham Neubig
Ryohei Sasano
Hiroya Takamura
Manabu Okumura
30
242
0
30 Sep 2016
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu
Zhe Gan
Ricardo Henao
Xin Yuan
Chunyuan Li
Andrew Stevens
Lawrence Carin
BDL
CoGe
45
746
0
28 Sep 2016
Character Sequence Models for ColorfulWords
Kazuya Kawakami
Chris Dyer
Bryan R. Routledge
Noah A. Smith
3DV
28
17
0
28 Sep 2016
Learning Language-Visual Embedding for Movie Understanding with Natural-Language
Atousa Torabi
Niket Tandon
Leonid Sigal
22
97
0
26 Sep 2016
Visual Fashion-Product Search at SK Planet
Taewan Kim
Seyeong Kim
Sangil Na
Hayoon Kim
Moonki Kim
Beyeongki Jeon
11
6
0
26 Sep 2016
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Yishu Miao
Phil Blunsom
201
223
0
23 Sep 2016
The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)
Andrew Shin
Yoshitaka Ushiku
Tatsuya Harada
52
14
0
21 Sep 2016
Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
30
851
0
21 Sep 2016
Enhanced LSTM for Natural Language Inference
Qian Chen
Xiao-Dan Zhu
Zhenhua Ling
Si Wei
Hui Jiang
Diana Inkpen
LRM
ReLM
41
1,127
0
20 Sep 2016
Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng
Anssi Kanervisto
Jeffrey Ling
Alexander M. Rush
19
226
0
16 Sep 2016
Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Mark Harmon
Abdolghani Ebrahimi
P. Lucey
Diego Klabjan
GAN
25
18
0
15 Sep 2016
Multimodal Attention for Neural Machine Translation
Ozan Caglayan
Loïc Barrault
Fethi Bougares
34
75
0
13 Sep 2016
Read, Tag, and Parse All at Once, or Fully-neural Dependency Parsing
J. Chorowski
Michal Zapotoczny
Paweł Rychlikowski
27
5
0
12 Sep 2016
The Role of Context Selection in Object Detection
Ruichi Yu
Xi Chen
Vlad I. Morariu
L. Davis
22
42
0
09 Sep 2016
Optimizing Recurrent Neural Networks Architectures under Time Constraints
Junqi Jin
Ziang Yan
Kun Fu
Nan Jiang
Changshui Zhang
24
2
0
29 Aug 2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples
T. Tanay
Lewis D. Griffin
AAML
25
270
0
27 Aug 2016
Learning to generalize to new compositions in image understanding
Yuval Atzmon
Jonathan Berant
Vahid Kezami
Amir Globerson
Gal Chechik
26
67
0
27 Aug 2016
Title Generation for User Generated Videos
Kuo-Hao Zeng
Tseng-Hung Chen
Juan Carlos Niebles
Min Sun
35
69
0
25 Aug 2016
Context Gates for Neural Machine Translation
Zhaopeng Tu
Yang Liu
Zhengdong Lu
Xiaohua Liu
Hang Li
29
137
0
22 Aug 2016
phi-LSTM: A Phrase-based Hierarchical LSTM Model for Image Captioning
Y. Tan
Chee Seng Chan
VLM
22
29
0
20 Aug 2016
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Edward Choi
M. T. Bahadori
Joshua A. Kulas
A. Schuetz
Walter F. Stewart
Jimeng Sun
AI4TS
60
1,232
0
19 Aug 2016
Modeling Human Reading with Neural Attention
Michael Hahn
Frank Keller
28
55
0
19 Aug 2016
Seeing with Humans: Gaze-Assisted Neural Image Captioning
Yusuke Sugano
Andreas Bulling
24
68
0
18 Aug 2016
Temporal Attention Model for Neural Machine Translation
B. Sankaran
Haitao Mi
Yaser Al-Onaizan
Abe Ittycheriah
17
62
0
09 Aug 2016
End-to-End Localization and Ranking for Relative Attributes
Krishna Kumar Singh
Yong Jae Lee
27
76
0
09 Aug 2016
Learning Online Alignments with Continuous Rewards Policy Gradient
Yuping Luo
Chung-Cheng Chiu
Navdeep Jaitly
Ilya Sutskever
OffRL
18
46
0
03 Aug 2016
Modeling Context Between Objects for Referring Expression Understanding
Varun K. Nagaraja
Vlad I. Morariu
Larry S. Davis
40
145
0
01 Aug 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
53
1,233
0
31 Jul 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
36
1,888
0
29 Jul 2016
Salient Object Subitizing
Jianming Zhang
Shugao Ma
M. Sameki
Stan Sclaroff
Margrit Betke
Zhe Lin
Xiaohui Shen
Brian L. Price
R. Měch
23
115
0
26 Jul 2016
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
Lluis Castrejon
Y. Aytar
Carl Vondrick
Hamed Pirsiavash
Antonio Torralba
SSL
DRL
AI4TS
37
167
0
25 Jul 2016
An Actor-Critic Algorithm for Sequence Prediction
Dzmitry Bahdanau
Philemon Brakel
Kelvin Xu
Anirudh Goyal
Ryan J. Lowe
Joelle Pineau
Aaron Courville
Yoshua Bengio
57
636
0
24 Jul 2016
Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition
Jun Liu
Amir Shahroudy
Dong Xu
Gang Wang
49
1,099
0
24 Jul 2016
Hierarchical Attention Network for Action Recognition in Videos
Yilin Wang
Suhang Wang
Jiliang Tang
Neil O'Hare
Yi-Ju Chang
Baoxin Li
BDL
30
82
0
21 Jul 2016
Constructing a Natural Language Inference Dataset using Generative Neural Networks
Janez Starc
Dunja Mladenić
27
7
0
20 Jul 2016
Visual Question Answering: A Survey of Methods and Datasets
Qi Wu
Damien Teney
Peng Wang
Chunhua Shen
A. Dick
Anton Van Den Hengel
41
413
0
20 Jul 2016
HeMIS: Hetero-Modal Image Segmentation
Mohammad Havaei
N. Guizard
Nicolas Chapados
Yoshua Bengio
MedIm
26
261
0
18 Jul 2016
Weakly Supervised Learning of Heterogeneous Concepts in Videos
Sohil Shah
K. Kulkarni
Arijit Biswas
Ankit Gandhi
Om Deshmukh
L. Davis
32
2
0
12 Jul 2016
VideoLSTM Convolves, Attends and Flows for Action Recognition
Zhenyang Li
E. Gavves
Mihir Jain
Cees G. M. Snoek
50
463
0
06 Jul 2016