Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
A Deep Neural Framework for Image Caption Generation Using GRU-Based Attention Mechanism
Rashid Khan
Shujah Islam
Khadija Kanwal
Mansoor Iqbal
Md. Imran Hossain
Z. Ye
3DV
35
18
0
03 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
104
109
0
02 Mar 2022
MSCTD: A Multimodal Sentiment Chat Translation Dataset
Yunlong Liang
Fandong Meng
Jinan Xu
Yufeng Chen
Jie Zhou
60
22
0
28 Feb 2022
Interactive Machine Learning for Image Captioning
Mareike Hartmann
Aliki Anagnostopoulou
Daniel Sonntag
VLM
45
4
0
28 Feb 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
94
150
0
23 Feb 2022
Skeleton Sequence and RGB Frame Based Multi-Modality Feature Fusion Network for Action Recognition
Xiaoguang Zhu
Ye Zhu
Haoyu Wang
Honglin Wen
Yan Yan
Peilin Liu
101
28
0
23 Feb 2022
VU-BERT: A Unified framework for Visual Dialog
Tong Ye
Shijing Si
Jianzong Wang
Rui Wang
Ning Cheng
Jing Xiao
MLLM
96
5
0
22 Feb 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT, VLM
84
30
0
21 Feb 2022
OG-SGG: Ontology-Guided Scene Graph Generation. A Case Study in Transfer Learning for Telepresence Robotics
Fernando Amodeo
F. Caballero
N. Díaz-Rodríguez
L. Merino
LM&Ro
97
10
0
21 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
186
228
0
18 Feb 2022
XFBoost: Improving Text Generation with Controllable Decoders
Xiangyu Peng
Michael Sollami
75
1
0
16 Feb 2022
A Survey on Dynamic Neural Networks for Natural Language Processing
Canwen Xu
Julian McAuley
AI4CE
103
29
0
15 Feb 2022
Conditional Generation Net for Medication Recommendation
Rui Wu
Zhaopeng Qiu
Jiacheng Jiang
Guilin Qi
Xian Wu
68
93
0
14 Feb 2022
GAMMA Challenge: Glaucoma grAding from Multi-Modality imAges
Junde Wu
Huihui Fang
Fei Li
Huazhu Fu
Fengbin Lin
...
Q. Hu
Hrvoje Bogunović
J. Orlando
Xiulan Zhang
Yanwu Xu
90
64
0
14 Feb 2022
Multi-Modal Knowledge Graph Construction and Application: A Survey
Xiangru Zhu
Zhixu Li
Xiaodan Wang
Xueyao Jiang
Penglei Sun
Xuwu Wang
Yanghua Xiao
N. Yuan
73
167
0
11 Feb 2022
Multi-Modal Fusion for Sensorimotor Coordination in Steering Angle Prediction
Farzeen Munir
Shoaib Azam
Byung-geun Lee
M. Jeon
55
5
0
11 Feb 2022
Bench-Marking And Improving Arabic Automatic Image Captioning Through The Use Of Multi-Task Learning Paradigm
Muhy Eddin Za'ter
Bashar Talafha
VLM
54
2
0
11 Feb 2022
ACORT: A Compact Object Relation Transformer for Parameter Efficient Image Captioning
J. Tan
Y. Tan
C. Chan
Joon Huang Chuah
VLM, ViT
83
19
0
11 Feb 2022
Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs
Daniel Louzada Fernandes
Marcos Henrique Fonseca Ribeiro
F. Cerqueira
Michel Melo Silva
46
7
0
10 Feb 2022
Image Difference Captioning with Pre-training and Contrastive Learning
Linli Yao
Weiying Wang
Qin Jin
SSL, VLM
86
43
0
09 Feb 2022
Inference of captions from histopathological patches
M. Tsuneki
F. Kanavati
89
32
0
07 Feb 2022
FEAT: Face Editing with Attention
Xianxu Hou
Linlin Shen
Or Patashnik
Daniel Cohen-Or
Hui Huang
CVBM
66
20
0
06 Feb 2022
Continual Attentive Fusion for Incremental Learning in Semantic Segmentation
Guanglei Yang
Enrico Fini
Dan Xu
Paolo Rota
Mingli Ding
Hao Tang
Xavier Alameda-Pineda
Elisa Ricci
CLL
78
22
0
01 Feb 2022
Interpretable and Generalizable Graph Learning via Stochastic Attention Mechanism
Siqi Miao
Miaoyuan Liu
Pan Li
99
215
0
31 Jan 2022
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
136
101
0
31 Jan 2022
Learning Intuitive Policies Using Action Features
Mingwei Ma
Jizhou Liu
Samuel Sokota
Max Kleiman-Weiner
Jakob N. Foerster
95
4
0
29 Jan 2022
Automatic Audio Captioning using Attention weighted Event based Embeddings
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
75
0
0
28 Jan 2022
Natural Language Descriptions of Deep Visual Features
Evan Hernandez
Sarah Schwettmann
David Bau
Teona Bagashvili
Antonio Torralba
Jacob Andreas
MILM
320
126
0
26 Jan 2022
A Bayesian Based Deep Unrolling Algorithm for Single-Photon Lidar Systems
JaKeoung Koo
Abderrahim Halimi
S. Mclaughlin
BDL, 3DV
73
18
0
26 Jan 2022
Do Smart Glasses Dream of Sentimental Visions? Deep Emotionship Analysis for Eyewear Devices
Yingying Zhao
Yuhu Chang
Yutian Lu
Yujiang Wang
Mingzhi Dong
...
Robert P. Dick
Fan Yang
Tun Lu
Ning Gu
L. Shang
78
10
0
24 Jan 2022
The Paradox of Choice: Using Attention in Hierarchical Reinforcement Learning
A. Nica
Khimya Khetarpal
Doina Precup
62
4
0
24 Jan 2022
One-Shot Learning on Attributed Sequences
Zhongfang Zhuang
Xiangnan Kong
Elke A. Rundensteiner
Aditya Arora
Jihane Zouaoui
BDL
82
2
0
23 Jan 2022
An Integrated Approach for Video Captioning and Applications
Soheyla Amirian
T. Taha
Khaled Rasheed
H. Arabnia
64
1
0
23 Jan 2022
Online Attentive Kernel-Based Temporal Difference Learning
Guang Yang
Xingguo Chen
Shangdong Yang
Huihui Wang
Shaokang Dong
Yang Gao
OffRL
28
3
0
22 Jan 2022
Learning-by-Novel-View-Synthesis for Full-Face Appearance-Based 3D Gaze Estimation
Jiawei Qin
Takuru Shimoyama
Yusuke Sugano
3DH, CVBM
81
17
0
20 Jan 2022
Context-Aware Scene Prediction Network (CASPNet)
Maximilian Schäfer
Kun-li Zhao
Markus Bühren
A. Kummert
81
11
0
18 Jan 2022
A Literature Survey of Recent Advances in Chatbots
Guendalina Caldarini
Sardar F. Jaf
K. McGarry
AI4CE
84
290
0
17 Jan 2022
Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks
Yuqi Wang
Xu-Yao Zhang
Cheng-Lin Liu
Zhaoxiang Zhang
57
2
0
14 Jan 2022
Prior Knowledge Enhances Radiology Report Generation
Song Wang
Liyan Tang
Mingquan Lin
George Shih
Ying Ding
Yifan Peng
MedIm
67
24
0
11 Jan 2022
Wind Park Power Prediction: Attention-Based Graph Networks and Deep Learning to Capture Wake Losses
Lars Odegaard Bentsen
N. Warakagoda
R. Stenbro
P. Engelstad
75
16
0
10 Jan 2022
Glance and Focus Networks for Dynamic Visual Recognition
Gao Huang
Yulin Wang
Kangchen Lv
Haojun Jiang
Wenhui Huang
Pengfei Qi
S. Song
3DH
150
50
0
09 Jan 2022
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval
Zhixiong Zeng
Wenji Mao
VLM
77
18
0
08 Jan 2022
Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping
Nora Horanyi
Kedi Xia
K. M. Yi
Abhishake Kumar Bojja
A. Leonardis
H. Chang
101
12
0
07 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
67
17
0
06 Jan 2022
Discrete and continuous representations and processing in deep learning: Looking forward
Ruben Cartuyvels
Graham Spinks
Marie-Francine Moens
OCL
95
20
0
04 Jan 2022
StyleM: Stylized Metrics for Image Captioning Built with Contrastive N-grams
Chengxi Li
Brent Harrison
110
3
0
04 Jan 2022
Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety
A. Rajagopal
V. Nirmala
Arun Muthuraj Vedamanickam
89
0
0
04 Jan 2022
Self-attention Multi-view Representation Learning with Diversity-promoting Complementarity
Jian Liu
Xi-hao Ding
Run-kun Lu
Xiong-lin Luo
77
1
0
01 Jan 2022
Deconfounded Visual Grounding
Jianqiang Huang
Yu Qin
Jiaxin Qi
Qianru Sun
Hanwang Zhang
CML, ObjD
63
33
0
31 Dec 2021
ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation
Han Zhang
Weichong Yin
Yewei Fang
Lanxin Li
Boqiang Duan
Zhihua Wu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
71
59
0
31 Dec 2021