ResearchTrend.AI
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
    DiffM

Papers citing "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"

50 / 3,508 papers shown
Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition
Jia-Xin Zhuang
Jiabin Cai
Jianguo Zhang
Wei-Shi Zheng
Ruixuan Wang
16
10
0
19 Jul 2023
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
Zijie Song
Zhenzhen Hu
Yuanen Zhou
Ye Zhao
Richang Hong
Meng Wang
21
2
0
19 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Human Action Recognition in Still Images Using ConViT
Seyed Rohollah Hosseyni
Sanaz Seyedin
Hasan Taheri
ViT
17
0
0
18 Jul 2023
GenAssist: Making Image Generation Accessible
Mina Huh
Yi-Hao Peng
Amy Pavel
DiffM
25
29
0
14 Jul 2023
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes
Guoyun Tu
Ying Liu
Vladimir Vlassov
139
1
0
14 Jul 2023
Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Yiren Jian
Chongyang Gao
Soroush Vosoughi
VLM
MLLM
32
25
0
13 Jul 2023
Is Task-Agnostic Explainable AI a Myth?
Alicja Chaszczewicz
26
2
0
13 Jul 2023
Reading Radiology Imaging Like The Radiologist
Yuhao Wang
MedIm
34
0
0
12 Jul 2023
DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization
Simin Chen
Shiyi Wei
Cong Liu
Wei Yang
24
6
0
11 Jul 2023
Undecimated Wavelet Transform for Word Embedded Semantic Marginal Autoencoder in Security improvement and Denoising different Languages
S. Shreyanth
18
0
0
06 Jul 2023
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
36
5
0
05 Jul 2023
Seeing in Words: Learning to Classify through Language Bottlenecks
Khalid Saifullah
Yuxin Wen
Jonas Geiping
Micah Goldblum
Tom Goldstein
VLM
21
2
0
29 Jun 2023
Variational latent discrete representation for time series modelling
Max H. Cohen
M. Charbit
Sylvain Le Corff
27
1
0
27 Jun 2023
Self-Supervised Image Captioning with CLIP
Chuanyang Jin
VLM
SSL
23
2
0
26 Jun 2023
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
32
9
0
25 Jun 2023
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation
Zihao Yue
Anwen Hu
Liang Zhang
Qin Jin
24
2
0
23 Jun 2023
Dense Video Object Captioning from Disjoint Supervision
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
31
3
0
20 Jun 2023
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Zhongzhen Huang
Xiaofan Zhang
Shaoting Zhang
MedIm
25
51
0
20 Jun 2023
GraphGLOW: Universal and Generalizable Structure Learning for Graph Neural Networks
Wentao Zhao
Qitian Wu
Chenxiao Yang
Junchi Yan
24
12
0
20 Jun 2023
Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation
Shuo Chen
Yingjun Du
Pascal Mettes
Cees G. M. Snoek
OffRL
36
2
0
16 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
35
7
0
14 Jun 2023
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
23
3
0
13 Jun 2023
Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions
N. Rodis
Christos Sardianos
Panagiotis I. Radoglou-Grammatikis
Panagiotis G. Sarigiannidis
Iraklis Varlamis
Georgios Th. Papadopoulos
25
22
0
09 Jun 2023
Customizing General-Purpose Foundation Models for Medical Report Generation
Bang-ju Yang
Asif Raza
Yuexian Zou
Tong Zhang
MedIm
25
11
0
09 Jun 2023
Object Detection with Transformers: A Review
Tahira Shehzadi
K. Hashmi
D. Stricker
Muhammad Zeshan Afzal
ViT
MU
23
28
0
07 Jun 2023
Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
CLL
VLM
23
0
0
06 Jun 2023
Putting Humans in the Image Captioning Loop
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
VLM
32
1
0
06 Jun 2023
On the Role of Attention in Prompt-tuning
Samet Oymak
A. S. Rawat
Mahdi Soltanolkotabi
Christos Thrampoulidis
MLT
LRM
25
41
0
06 Jun 2023
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
Jianghui Wang
Yuxuan Wang
Dongyan Zhao
Zilong Zheng
46
1
0
04 Jun 2023
Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models
Hidetaka Kamigaito
Katsuhiko Hayashi
Taro Watanabe
VLM
15
1
0
03 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
35
0
0
02 Jun 2023
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning
Abisek Rajakumar Kalarani
P. Bhattacharyya
Niyati Chhaya
Sumit Shekhar
CoGe
VLM
19
9
0
01 Jun 2023
Cross-Domain Car Detection Model with Integrated Convolutional Block Attention Mechanism
Haoxuan Xu
Songning Lai
Xianyang Li
Y. Yang
ViT
23
15
0
31 May 2023
HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic Joint Infection Diagnosis Using CT Images and Text
Ruiyang Li
Fujun Yang
Xianjie Liu
Hon-Yi Shi
30
0
0
29 May 2023
GBG++: A Fast and Stable Granular Ball Generation Method for Classification
Qin Xie
Qinghua Zhang
Shuyin Xia
Fan Zhao
Chengying Wu
Guoyin Wang
Weiping Ding
24
13
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
26
27
0
28 May 2023
S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts
Qi Chen
Yutong Xie
Biao Wu
Minh Nguyen Nhat To
James Ang
Qi Wu
13
1
0
26 May 2023
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo
Z. Kira
37
21
0
25 May 2023
TOAST: Transfer Learning via Attention Steering
Baifeng Shi
Siyu Gai
Trevor Darrell
Xin Wang
39
9
0
24 May 2023
Evolutionary Algorithms in the Light of SGD: Limit Equivalence, Minima Flatness, and Transfer Learning
Andrei Kucharavy
R. Guerraoui
Ljiljana Dolamic
32
1
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
34
5
0
20 May 2023
Explaining V1 Properties with a Biologically Constrained Deep Learning Architecture
Galen Pogoncheff
Jacob Granley
M. Beyeler
AAML
FAtt
11
9
0
18 May 2023
Emergent Communication with Attention
Ryokan Ri
Ryo Ueda
Jason Naradowsky
24
2
0
18 May 2023
A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot
Aanisha Bhattacharya
Yaman Kumar Singla
Balaji Krishnamurthy
R. Shah
Changyou Chen
VGen
32
11
0
16 May 2023
PLIP: Language-Image Pre-training for Person Representation Learning
Jia-li Zuo
Jiahao Hong
Feng Zhang
Changqian Yu
Hanyu Zhou
Changxin Gao
Nong Sang
Jingdong Wang
VLM
MLLM
35
31
0
15 May 2023
Mask to reconstruct: Cooperative Semantics Completion for Video-text Retrieval
Han Fang
Zhifei Yang
Xianghao Zang
Chao Ban
Hao Sun
VGen
34
2
0
13 May 2023
Automatic Radiology Report Generation by Learning with Increasingly Hard Negatives
Bhanu Prakash Voutharoja
Lei Wang
Luping Zhou
MedIm
33
8
0
11 May 2023
Learning the Visualness of Text Using Large Vision-Language Models
Gaurav Verma
Ryan A. Rossi
Chris Tensmeyer
Jiuxiang Gu
A. Nenkova
VLM
14
0
0
11 May 2023
Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification
Xulin Li
Yan Lu
B. Liu
Yuenan Hou
Yating Liu
Qi Chu
Wanli Ouyang
Nenghai Yu
OOD
CML
35
4
0
10 May 2023