ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Ke Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,513 papers shown
Exploring modality-agnostic representations for music classification
Ho-Hsiang Wu
Magdalena Fuentes
J. P. Bello
31
4
0
02 Jun 2021
Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features
Nicola Messina
Giuseppe Amato
Fabrizio Falchi
Claudio Gennaro
Stéphane Marchand-Maillet
22
7
0
01 Jun 2021
ACE-NODE: Attentive Co-Evolving Neural Ordinary Differential Equations
Sheo Yon Jhin
Minju Jo
Taeyong Kong
Jinsung Jeon
Noseong Park
BDL
37
13
0
31 May 2021
Cascaded Head-colliding Attention
Lin Zheng
Zhiyong Wu
Lingpeng Kong
34
2
0
31 May 2021
Q-attention: Enabling Efficient Learning for Vision-based Robotic Manipulation
Stephen James
Andrew J. Davison
22
122
0
31 May 2021
Data Fusion for Deep Learning on Transport Mode Detection: A Case Study
Hugues Moreau
A. Vassilev
Liming Chen
29
2
0
31 May 2021
Multiscale IoU: A Metric for Evaluation of Salient Object Detection with Fine Structures
Azim Ahmadzadeh
Dustin J. Kempton
Yang Chen
R. Angryk
24
6
0
30 May 2021
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Jia-Hong Huang
Ting-Wei Wu
Chao-Han Huck Yang
Marcel Worring
MedIm
25
28
0
30 May 2021
Towards Diverse Paragraph Captioning for Untrimmed Videos
Yuqing Song
Shizhe Chen
Qin Jin
26
37
0
30 May 2021
Maintaining Common Ground in Dynamic Environments
Takuma Udagawa
Akiko Aizawa
27
12
0
29 May 2021
FoveaTer: Foveated Transformer for Image Classification
Aditya Jonnalagadda
Wenjie Wang
B. S. Manjunath
Miguel P. Eckstein
ViT
43
23
0
29 May 2021
Recursive Contour Saliency Blending Network for Accurate Salient Object Detection
Y. Yun
Takahiro Tsubono
53
58
0
28 May 2021
New Encoder Learning for Captioning Heavy Rain Images via Semantic Visual Feature Matching
Chang-Hwan Son
Pung-Hwi Ye
33
3
0
28 May 2021
THINK: A Novel Conversation Model for Generating Grammatically Correct and Coherent Responses
Bin Sun
Shaoxiong Feng
Yiwei Li
Jiamou Liu
Kan Li
27
3
0
28 May 2021
Recent advances and clinical applications of deep learning in medical image analysis
Xuxin Chen
Ximing Wang
Kecheng Zhang
K. Fung
T. Thai
K. Moore
Robert S. Mannel
Hong Liu
B. Zheng
Y. Qiu
OOD
23
577
0
27 May 2021
Cardiac Segmentation on CT Images through Shape-Aware Contour Attentions
Sanguk Park
Minyoung Chung
14
15
0
27 May 2021
GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph
Cheikh Brahim El Vaigh
Noa Garcia
B. Renoust
Chenhui Chu
Yuta Nakashima
Hajime Nagahara
11
24
0
25 May 2021
Writing by Memorizing: Hierarchical Retrieval-based Medical Report Generation
Xingyi Yang
Muchao Ye
Quanzeng You
Fenglong Ma
MedIm
24
38
0
25 May 2021
Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training
Jong Hak Moon
HyunGyung Lee
W. Shin
Young-Hak Kim
Edward Choi
MedIm
34
153
0
24 May 2021
Automated Knee X-ray Report Generation
Aydan Gasimova
Giovanni Montana
Daniel Rueckert
MedIm
17
1
0
22 May 2021
Flexible Compositional Learning of Structured Visual Concepts
Yanli Zhou
Brenden M. Lake
OCL
CoGe
11
7
0
20 May 2021
Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks
Thorben Funke
Megha Khosla
Mandeep Rathee
Avishek Anand
FAtt
43
38
0
18 May 2021
Dependent Multi-Task Learning with Causal Intervention for Image Captioning
Wenqing Chen
Jidong Tian
Caoyun Fan
Hao He
Yaohui Jin
CML
32
6
0
18 May 2021
I2C2W: Image-to-Character-to-Word Transformers for Accurate Scene Text Recognition
Chuhui Xue
Jiaxing Huang
Wenqing Zhang
Shijian Lu
Changhu Wang
S. Bai
38
16
0
18 May 2021
Finding a Needle in a Haystack: Tiny Flying Object Detection in 4K Videos using a Joint Detection-and-Tracking Approach
Ryota Yoshihashi
Rei Kawakami
Shaodi You
T. Trinh
M. Iida
T. Naemura
ObjD
VOT
34
3
0
18 May 2021
Empirical Analysis of Image Caption Generation using Deep Learning
Aditya R. Bhattacharya
Eshwar Shamanna Girishekar
Padmakar Anil Deshpande
21
1
0
14 May 2021
Audio Captioning with Composition of Acoustic and Semantic Information
Aysegül Özkaya Eren
M. Sert
46
3
0
13 May 2021
SAFIN: Arbitrary Style Transfer With Self-Attentive Factorized Instance Normalization
Aaditya Singh
Shreeshail Hingane
Xinyu Gong
Zhangyang Wang
13
19
0
13 May 2021
Connecting What to Say With Where to Look by Modeling Human Attention Traces
Zihang Meng
Licheng Yu
Ning Zhang
Tamara L. Berg
Babak Damavandi
Vikas Singh
Amy Bearman
47
25
0
12 May 2021
Instance-aware Remote Sensing Image Captioning with Cross-hierarchy Attention
Chengze Wang
Zhiyu Jiang
Yuan Yuan
21
11
0
11 May 2021
Primitive Representation Learning for Scene Text Recognition
Ruijie Yan
Liangrui Peng
Shanyu Xiao
Gang Yao
34
66
0
10 May 2021
T-EMDE: Sketching-based global similarity for cross-modal retrieval
Barbara Rychalska
Mikolaj Wieczorek
Jacek Dąbrowski
38
0
0
10 May 2021
KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation
Mengqi Xue
Mingli Song
Xinchao Wang
Ying Chen
Xingen Wang
Xiuming Zhang
20
10
0
10 May 2021
Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning
D. Guo
Ruiying Lu
Bo Chen
Zequn Zeng
Mingyuan Zhou
VLM
39
9
0
10 May 2021
Graph Attention Networks with Positional Embeddings
Liheng Ma
Reihaneh Rabbany
Adriana Romero Soriano
GNN
33
20
0
09 May 2021
A Hybrid Model for Combining Neural Image Caption and k-Nearest Neighbor Approach for Image Captioning
Kartik Arora
Ajul Raj
Arun Goel
Seba Susan
19
0
0
09 May 2021
Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification
G. Chrysostomou
Nikolaos Aletras
37
38
0
06 May 2021
Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer
Wenqi Zhao
Liangcai Gao
Zuoyu Yan
Shuai Peng
Lin Du
Ziyin Zhang
ViT
35
53
0
06 May 2021
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
21
7
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
28
475
0
05 May 2021
Soft-Attention Improves Skin Cancer Classification Performance
S. Datta
Mohammad Abuzar Shaikh
H. Srihari
Mingchen Gao
27
104
0
05 May 2021
A survey on VQA_Datasets and Approaches
Yeyun Zou
Qiyu Xie
50
18
0
02 May 2021
End-to-End Attention-based Image Captioning
Carola Sundaramoorthy
Lin Ziwen Kelvin
Mahak Sarin
Shubham Gupta
ViT
24
6
0
30 Apr 2021
Learning Multi-Attention Context Graph for Group-Based Re-Identification
Yichao Yan
Jie Qin
Bingbing Ni
Jiaxin Chen
Li Liu
Fan Zhu
Weishi Zheng
Xiaokang Yang
Ling Shao
34
42
0
29 Apr 2021
Exploring Relational Context for Multi-Task Dense Prediction
David Brüggemann
Menelaos Kanakis
Anton Obukhov
Stamatios Georgoulis
Luc Van Gool
46
74
0
28 Apr 2021
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
Ukyo Honda
Yoshitaka Ushiku
Atsushi Hashimoto
Taro Watanabe
Yuji Matsumoto
38
23
0
28 Apr 2021
CAGAN: Text-To-Image Generation with Combined Attention GANs
Henning Schulze
Dogucan Yaman
Alexander Waibel
GAN
29
3
0
26 Apr 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
59
36
0
24 Apr 2021
EXplainable Neural-Symbolic Learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: the MonuMAI cultural heritage use case
Natalia Díaz Rodríguez
Alberto Lamas
Jules Sanchez
Gianni Franchi
Ivan Donadello
Siham Tabik
David Filliat
P. Cruz
Rosana Montes
Francisco Herrera
54
77
0
24 Apr 2021
AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
Ran Ben Izhak
Alon Lahav
A. Tal
3DV
57
10
0
23 Apr 2021