ResearchTrend.AI

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,520 papers shown
Competence-based Multimodal Curriculum Learning for Medical Report Generation
Fenglin Liu
Shen Ge
Yuexian Zou
Xian Wu
MedIm
152
140
0
24 Jun 2022
CLAMP: Prompt-based Contrastive Learning for Connecting Language and Animal Pose
Xu Zhang
Wen Wang
Zhe Chen
Yufei Xu
Jing Zhang
Dacheng Tao
CLIP, VLM
73
28
0
23 Jun 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
122
1
0
21 Jun 2022
A Self-Guided Framework for Radiology Report Generation
Jun Li
Shibo Li
Ying Hu
Huiren Tao
MedIm
66
22
0
19 Jun 2022
What is Where by Looking: Weakly-Supervised Open-World Phrase-Grounding without Text Inputs
Tal Shaharabany
Yoad Tewel
Lior Wolf
ObjD
96
16
0
19 Jun 2022
Cross-task Attention Mechanism for Dense Multi-task Learning
Ivan Lopes
Tuan-Hung Vu
Raoul de Charette
100
29
0
17 Jun 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
40
3
0
16 Jun 2022
Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises
Minkyu Choi
Yizhen Zhang
Kuan Han
Xiaokai Wang
Zhongming Liu
AAML, GAN
68
4
0
15 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
82
92
0
14 Jun 2022
Challenges in Applying Explainability Methods to Improve the Fairness of NLP Models
Esma Balkir
S. Kiritchenko
I. Nejadgholi
Kathleen C. Fraser
94
37
0
08 Jun 2022
Code-DKT: A Code-based Knowledge Tracing Model for Programming Tasks
Yang Shi
Min Chi
Tiffany Barnes
T. Price
AI4Ed
85
26
0
07 Jun 2022
Improving Image Captioning with Control Signal of Sentence Quality
Zhangzi Zhu
Hong Qu
83
0
0
07 Jun 2022
Invariant Grounding for Video Question Answering
Yicong Li
Xiang Wang
Junbin Xiao
Wei Ji
Tat-Seng Chua
OOD
79
99
0
06 Jun 2022
Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives
Jun Li
Junyu Chen
Yucheng Tang
Ce Wang
Bennett A. Landman
S. K. Zhou
ViT, OOD, MedIm
177
47
0
02 Jun 2022
CLIP4IDC: CLIP for Image Difference Captioning
Zixin Guo
Tong Wang
Jorma T. Laaksonen
VLM
76
30
0
01 Jun 2022
BAN-Cap: A Multi-Purpose English-Bangla Image Descriptions Dataset
Mohammad Faiyaz Khan
S. M. S. Shifath
Md. Saiful Islam
51
6
0
28 May 2022
Personalized PageRank Graph Attention Networks
Julie Choi
GNN
32
6
0
27 May 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Tianlin Li
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
Chen Chen
VLM
97
33
0
26 May 2022
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Joey Tianyi Zhou
CLIP
234
79
0
26 May 2022
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu
Jiwan Chung
Heeseung Yun
Jack Hessel
Jinho Park
...
Prithviraj Ammanabrolu
Rowan Zellers
Ronan Le Bras
Gunhee Kim
Yejin Choi
VLM
160
37
0
25 May 2022
The Dialog Must Go On: Improving Visual Dialog via Generative Self-Training
Gi-Cheon Kang
Sungdong Kim
Jin-Hwa Kim
Donghyun Kwak
Byoung-Tak Zhang
99
10
0
25 May 2022
Face2Text revisited: Improved data set and baseline results
Marc Tanti
Shaun Abdilla
A. Muscat
Claudia Borg
R. Farrugia
Albert Gatt
CVBM
37
3
0
24 May 2022
SelfReformer: Self-Refined Network with Transformer for Salient Object Detection
Y. Yun
Weisi Lin
ViT
124
29
0
23 May 2022
Explanatory machine learning for sequential human teaching
L. Ai
Johannes Langer
Stephen Muggleton
Ute Schmid
104
5
0
20 May 2022
Explainable Supervised Domain Adaptation
V. Kamakshi
N. C. Krishnan
73
2
0
20 May 2022
Let's Talk! Striking Up Conversations via Conversational Visual Question Generation
Shih-Han Chan
Tsai-Lun Yang
Yun-Wei Chu
Chi-Yang Hsu
Ting-Hao 'Kenneth' Huang
Yu-Shian Chiu
Lun-Wei Ku
48
1
0
19 May 2022
Support-set based Multi-modal Representation Enhancement for Video Captioning
Xiaoya Chen
Jingkuan Song
Pengpeng Zeng
Lianli Gao
Hengtao Shen
73
4
0
19 May 2022
It Isn't Sh!tposting, It's My CAT Posting
Parthsarthi Rawat
Sayan Das
Jorge Aguirre
Akhil Daphara
ViT
27
0
0
18 May 2022
Measuring Alignment Bias in Neural Seq2Seq Semantic Parsers
Davide Locatelli
A. Quattoni
68
2
0
17 May 2022
Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection
Wei Ji
Jingjing Li
Qi Bi
Chuan Guo
Jieyan Liu
Li Cheng
72
45
0
15 May 2022
Video-based assessment of intraoperative surgical skill
Sanchit Hira
Digvijay Singh
Tae Soo Kim
Shobhit Gupta
Gregory D. Hager
S. Sikder
S. Vedula
37
18
0
13 May 2022
Explainable Deep Learning Methods in Medical Image Classification: A Survey
Cristiano Patrício
João C. Neves
Luís F. Teixeira
XAI
103
60
0
10 May 2022
A Song of (Dis)agreement: Evaluating the Evaluation of Explainable Artificial Intelligence in Natural Language Processing
Michael Neely
Stefan F. Schouten
Maurits J. R. Bleeker
Ana Lucic
XAI
86
18
0
09 May 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo
Z. Kira
97
56
0
09 May 2022
Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information
Zhipeng Zhang
Xinglin Hou
K. Niu
Zhongzhen Huang
T. Ge
Yuning Jiang
Qi Wu
Peifeng Wang
71
5
0
07 May 2022
From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data
Zhenghang Yuan
Lichao Mou
Q. Wang
Xiao Xiang Zhu
105
67
0
06 May 2022
Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization
Wei Wei
Hengguan Huang
Xiangming Gu
Hao Wang
Ye Wang
BDL
73
0
0
05 May 2022
Language Models Can See: Plugging Visual Controls in Text Generation
Yixuan Su
Tian Lan
Yahui Liu
Fangyu Liu
Dani Yogatama
Yan Wang
Lingpeng Kong
Nigel Collier
VLM, MLLM
113
98
0
05 May 2022
Diverse Image Captioning with Grounded Style
Franz Klein
Shweta Mahajan
S. Roth
79
7
0
03 May 2022
Inducing and Using Alignments for Transition-based AMR Parsing
Andrew Drozdov
Jiawei Zhou
Radu Florian
Andrew McCallum
Tahira Naseem
Yoon Kim
Ramón Fernández Astudillo
78
27
0
03 May 2022
Cracking White-box DNN Watermarks via Invariant Neuron Transforms
Yifan Yan
Xudong Pan
Yining Wang
Mi Zhang
Min Yang
AAML
55
14
0
30 Apr 2022
Controllable Image Captioning
Luka Maxwell
112
0
0
28 Apr 2022
Cross-modal Memory Networks for Radiology Report Generation
Zhihong Chen
Yaling Shen
Yan Song
Xiang Wan
MedIm
121
262
0
28 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
129
183
0
27 Apr 2022
A survey on attention mechanisms for medical applications: are we moving towards better algorithms?
Tiago Gonçalves
Isabel Rio-Torto
Luís F. Teixeira
J. S. Cardoso
OOD, MedIm
103
41
0
26 Apr 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
104
30
0
25 Apr 2022
Recurrent Affine Transformation for Text-to-image Synthesis
Senmao Ye
Fei Liu
Mingkui Tan
54
27
0
22 Apr 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
99
140
0
21 Apr 2022
Attention in Reasoning: Dataset, Analysis, and Modeling
Shi Chen
Ming Jiang
Jinhui Yang
Qi Zhao
LRM
50
3
0
20 Apr 2022
Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention
Leo Schwinn
Doina Precup
Bjoern M. Eskofier
Dario Zanca
45
7
0
19 Apr 2022