ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

arXiv:1502.03044 (versions v1, v2, v3; v3 is the latest)

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Title
Fluency-Guided Cross-Lingual Image Captioning
Weiyu Lan
Xirong Li
Jianfeng Dong
71
95
0
15 Aug 2017
Deep Edge-Aware Saliency Detection
Jing Zhang
Yuchao Dai
Fatih Porikli
Mingyi He
60
15
0
15 Aug 2017
Situation Recognition with Graph Neural Networks
Ruiyu Li
Makarand Tapaswi
Renjie Liao
Jiaya Jia
R. Urtasun
Sanja Fidler
GNN
72
132
0
14 Aug 2017
Emotion Detection on TV Show Transcripts with Sequence-based Convolutional Neural Networks
Sayyed M. Zahiri
Jinho Choi
108
225
0
14 Aug 2017
Early Improving Recurrent Elastic Highway Network
Hyunsin Park
Chang D. Yoo
48
5
0
14 Aug 2017
Argument Labeling of Explicit Discourse Relations using LSTM Neural Networks
Sohail Hooda
Leila Kosseim
25
9
0
11 Aug 2017
Attention-Aware Face Hallucination via Deep Reinforcement Learning
Qingxing Cao
Liang Lin
Yukai Shi
Xiaodan Liang
Guanbin Li
SupR
82
195
0
10 Aug 2017
Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering
Zhou Yu
Jun-chen Yu
Chenchao Xiang
Jianping Fan
Dacheng Tao
101
462
0
10 Aug 2017
TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References
Zizhao Zhang
Pingjun Chen
Manish Sapkota
Ling Yang
MedIm
76
69
0
10 Aug 2017
Hierarchically-Attentive RNN for Album Summarization and Storytelling
Licheng Yu
Joey Tianyi Zhou
Tamara L. Berg
92
66
0
09 Aug 2017
Learning to Disambiguate by Asking Discriminative Questions
Yining Li
Chen Huang
Xiaoou Tang
Chen Change Loy
67
22
0
09 Aug 2017
Recent Trends in Deep Learning Based Natural Language Processing
Tom Young
Devamanyu Hazarika
Soujanya Poria
Min Zhang
135
2,848
0
09 Aug 2017
Deep Binaries: Encoding Semantic-Rich Cues for Efficient Textual-Visual Cross Retrieval
Yuming Shen
Li Liu
Ling Shao
Jingkuan Song
65
49
0
08 Aug 2017
FoveaNet: Perspective-aware Urban Scene Parsing
Xuzhao Li
Zequn Jie
Wei Wang
Changsong Liu
Jimei Yang
Xiaohui Shen
Zhe Lin
Qiang Chen
Shuicheng Yan
Jiashi Feng
3DPC
82
58
0
08 Aug 2017
GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images
Avi Singh
Larry Yang
Sergey Levine
61
23
0
07 Aug 2017
Structured Attentions for Visual Question Answering
Chen Zhu
Yanpeng Zhao
Shuaiyi Huang
Kewei Tu
Yi-An Ma
FAtt
95
107
0
07 Aug 2017
What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?
Marc Tanti
Albert Gatt
K. Camilleri
48
56
0
07 Aug 2017
A Comparison of Neural Models for Word Ordering
Eva Hasler
Felix Stahlberg
Marcus Tomalin
Adria de Gispert
Bill Byrne
48
20
0
05 Aug 2017
Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering
Zhou Yu
Jun-chen Yu
Jianping Fan
Dacheng Tao
87
669
0
04 Aug 2017
MemexQA: Visual Memex Question Answering
Lu Jiang
Junwei Liang
Liangliang Cao
Yannis Kalantidis
S. Farfade
Alexander G. Hauptmann
46
28
0
04 Aug 2017
Sensor Transformation Attention Networks
Stefan Braun
Daniel Neil
Enea Ceolini
Jithendar Anumula
Shih-Chii Liu
61
1
0
03 Aug 2017
Jointly Attentive Spatial-Temporal Pooling Networks for Video-based Person Re-Identification
Shuangjie Xu
Yu Cheng
Kang Gu
Yang Yang
Shiyu Chang
Pan Zhou
80
321
0
03 Aug 2017
Dual-Glance Model for Deciphering Social Relationships
Junnan Li
Yongkang Wong
Qi Zhao
Mohan Kankanhalli
62
81
0
02 Aug 2017
Attend and Predict: Understanding Gene Regulation by Selective Attention on Chromatin
Ritambhara Singh
Jack Lanchantin
Arshdeep Sekhon
Yanjun Qi
95
71
0
01 Aug 2017
A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models
Kartik Goyal
Graham Neubig
Chris Dyer
Taylor Berg-Kirkpatrick
3DV
122
40
0
01 Aug 2017
Time-Dependent Representation for Neural Event Sequence Prediction
Yang Li
Nan Du
Samy Bengio
AI4TS
96
59
0
31 Jul 2017
Scene Graph Generation from Objects, Phrases and Region Captions
Yikang Li
Wanli Ouyang
Bolei Zhou
Kun Wang
Xiaogang Wang
119
505
0
31 Jul 2017
Deep Interactive Region Segmentation and Captioning
Ali Sharifi Boroujerdi
M. Khanian
M. Breuß
55
7
0
26 Jul 2017
Learning Bag-of-Features Pooling for Deep Convolutional Neural Networks
Nikolaos Passalis
Anastasios Tefas
86
70
0
25 Jul 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Peter Anderson
Xiaodong He
Chris Buehler
Damien Teney
Mark Johnson
Stephen Gould
Lei Zhang
AIMat
286
4,232
0
25 Jul 2017
Person Re-identification Using Visual Attention
Alireza Rahimpour
Liu Liu
A. Taalimi
Yang Song
Hairong Qi
167
28
0
23 Jul 2017
Deeply-Learned Part-Aligned Representations for Person Re-Identification
Liming Zhao
Xi Li
Jingdong Wang
Yueting Zhuang
118
754
0
23 Jul 2017
Attention-Based End-to-End Speech Recognition on Voice Search
Changhao Shan
Junbo Zhang
Yujun Wang
Lei Xie
64
7
0
22 Jul 2017
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
Xuwang Yin
Vicente Ordonez
VLM
100
55
0
22 Jul 2017
Neural Person Search Machines
Hao Liu
Jiashi Feng
Zequn Jie
J. Karlekar
Bo Zhao
Meibin Qi
Jianguo Jiang
Shuicheng Yan
168
155
0
21 Jul 2017
Supervising Neural Attention Models for Video Captioning by Human Gaze Data
Youngjae Yu
Jongwook Choi
Yeonhwa Kim
Kyung Yoo
Sang-Hun Lee
Gunhee Kim
86
69
0
19 Jul 2017
The Role of Conversation Context for Sarcasm Detection in Online Interactions
Debanjan Ghosh
Alexander R. Fabbri
Smaranda Muresan
68
74
0
19 Jul 2017
Skeleton-Based Human Action Recognition with Global Context-Aware Attention LSTM Networks
Jun Liu
G. Wang
Ling-yu Duan
Kamila Abdiyeva
Alex C. Kot
HAI
217
490
0
18 Jul 2017
Order-Free RNN with Visual Attention for Multi-Label Classification
Shang-Fu Chen
Yi-Chen Chen
Chih-Kuan Yeh
Y. Wang
121
145
0
18 Jul 2017
DeepProbe: Information Directed Sequence Understanding and Chatbot Design via Recurrent Neural Networks
Zi Yin
Keng-hao Chang
Ruofei Zhang
62
51
0
18 Jul 2017
Aesthetic-Driven Image Enhancement by Adversarial Learning
Yubin Deng
Chen Change Loy
Xiaoou Tang
GAN
66
124
0
17 Jul 2017
Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
Yi Tay
Anh Tuan Luu
S. Hui
181
305
0
17 Jul 2017
Towards Bidirectional Hierarchical Representations for Attention-Based Neural Machine Translation
Baosong Yang
Derek F. Wong
Tong Xiao
Lidia S. Chao
Jingbo Zhu
77
33
0
17 Jul 2017
Visual Question Answering with Memory-Augmented Networks
Chao Ma
Chunhua Shen
A. Dick
Qi Wu
Peng Wang
Anton Van Den Hengel
Ian Reid
97
100
0
17 Jul 2017
Listening while Speaking: Speech Chain by Deep Learning
Andros Tjandra
S. Sakti
Satoshi Nakamura
AuLLM
173
168
0
16 Jul 2017
CUNI System for the WMT17 Multimodal Translation Task
Jindřich Helcl
Jindřich Libovický
67
11
0
14 Jul 2017
LIUM-CVC Submissions for WMT17 Multimodal Translation Task
Ozan Caglayan
Walid Aransa
Adrien Bardet
Mercedes García-Martínez
Fethi Bougares
Loïc Barrault
Marc Masana
Luis Herranz
Joost van de Weijer
89
64
0
14 Jul 2017
Large-scale Video Classification guided by Batch Normalized LSTM Translator
Jae Hyeon Yoo
VLM
26
12
0
13 Jul 2017
Source-Target Inference Models for Spatial Instruction Understanding
Hao Tan
Joey Tianyi Zhou
54
14
0
12 Jul 2017
Deep Fisher Discriminant Learning for Mobile Hand Gesture Recognition
Chunyu Xie
Ce Li
Baochang Zhang
Chong Chen
Jungong Han
HAI
92
66
0
12 Jul 2017