1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,508 papers shown
Title
Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition
Jia-Xin Zhuang
Jiabin Cai
Jianguo Zhang
Wei-Shi Zheng
Ruixuan Wang
16
10
0
19 Jul 2023
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
Zijie Song
Zhenzhen Hu
Yuanen Zhou
Ye Zhao
Richang Hong
Meng Wang
21
2
0
19 Jul 2023
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
Chaoyang Zhu
Long Chen
ObjD
VLM
31
32
0
18 Jul 2023
Human Action Recognition in Still Images Using ConViT
Seyed Rohollah Hosseyni
Sanaz Seyedin
Hasan Taheri
ViT
17
0
0
18 Jul 2023
GenAssist: Making Image Generation Accessible
Mina Huh
Yi-Hao Peng
Amy Pavel
DiffM
25
29
0
14 Jul 2023
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes
Guoyun Tu
Ying Liu
Vladimir Vlassov
139
1
0
14 Jul 2023
Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
Yiren Jian
Chongyang Gao
Soroush Vosoughi
VLM
MLLM
32
25
0
13 Jul 2023
Is Task-Agnostic Explainable AI a Myth?
Alicja Chaszczewicz
26
2
0
13 Jul 2023
Reading Radiology Imaging Like The Radiologist
Yuhao Wang
MedIm
34
0
0
12 Jul 2023
DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization
Simin Chen
Shiyi Wei
Cong Liu
Wei Yang
24
6
0
11 Jul 2023
Undecimated Wavelet Transform for Word Embedded Semantic Marginal Autoencoder in Security improvement and Denoising different Languages
S. Shreyanth
18
0
0
06 Jul 2023
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
36
5
0
05 Jul 2023
Seeing in Words: Learning to Classify through Language Bottlenecks
Khalid Saifullah
Yuxin Wen
Jonas Geiping
Micah Goldblum
Tom Goldstein
VLM
21
2
0
29 Jun 2023
Variational latent discrete representation for time series modelling
Max H. Cohen
M. Charbit
Sylvain Le Corff
27
1
0
27 Jun 2023
Self-Supervised Image Captioning with CLIP
Chuanyang Jin
VLM
SSL
23
2
0
26 Jun 2023
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
32
9
0
25 Jun 2023
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation
Zihao Yue
Anwen Hu
Liang Zhang
Qin Jin
24
2
0
23 Jun 2023
Dense Video Object Captioning from Disjoint Supervision
Xingyi Zhou
Anurag Arnab
Chen Sun
Cordelia Schmid
31
3
0
20 Jun 2023
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Zhongzhen Huang
Xiaofan Zhang
Shaoting Zhang
MedIm
25
51
0
20 Jun 2023
GraphGLOW: Universal and Generalizable Structure Learning for Graph Neural Networks
Wentao Zhao
Qitian Wu
Chenxiao Yang
Junchi Yan
24
12
0
20 Jun 2023
Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation
Shuo Chen
Yingjun Du
Pascal Mettes
Cees G. M. Snoek
OffRL
36
2
0
16 Jun 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
35
7
0
14 Jun 2023
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
23
3
0
13 Jun 2023
Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions
N. Rodis
Christos Sardianos
Panagiotis I. Radoglou-Grammatikis
Panagiotis G. Sarigiannidis
Iraklis Varlamis
Georgios Th. Papadopoulos
25
22
0
09 Jun 2023
Customizing General-Purpose Foundation Models for Medical Report Generation
Bang-ju Yang
Asif Raza
Yuexian Zou
Tong Zhang
MedIm
25
11
0
09 Jun 2023
Object Detection with Transformers: A Review
Tahira Shehzadi
K. Hashmi
D. Stricker
Muhammad Zeshan Afzal
ViT
MU
23
28
0
07 Jun 2023
Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
CLL
VLM
23
0
0
06 Jun 2023
Putting Humans in the Image Captioning Loop
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
VLM
32
1
0
06 Jun 2023
On the Role of Attention in Prompt-tuning
Samet Oymak
A. S. Rawat
Mahdi Soltanolkotabi
Christos Thrampoulidis
MLT
LRM
25
41
0
06 Jun 2023
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
Jianghui Wang
Yuxuan Wang
Dongyan Zhao
Zilong Zheng
46
1
0
04 Jun 2023
Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models
Hidetaka Kamigaito
Katsuhiko Hayashi
Taro Watanabe
VLM
15
1
0
03 Jun 2023
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Qiangchang Wang
Yilong Yin
35
0
0
02 Jun 2023
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning
Abisek Rajakumar Kalarani
P. Bhattacharyya
Niyati Chhaya
Sumit Shekhar
CoGe
VLM
19
9
0
01 Jun 2023
Cross-Domain Car Detection Model with Integrated Convolutional Block Attention Mechanism
Haoxuan Xu
Songning Lai
Xianyang Li
Y. Yang
ViT
23
15
0
31 May 2023
HGT: A Hierarchical GCN-Based Transformer for Multimodal Periprosthetic Joint Infection Diagnosis Using CT Images and Text
Ruiyang Li
Fujun Yang
Xianjie Liu
Hon-Yi Shi
30
0
0
29 May 2023
GBG++: A Fast and Stable Granular Ball Generation Method for Classification
Qin Xie
Qinghua Zhang
Shuyin Xia
Fan Zhao
Chengying Wu
Guoyin Wang
Weiping Ding
24
13
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
26
27
0
28 May 2023
S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts
Qi Chen
Yutong Xie
Biao Wu
Minh Nguyen Nhat To
James Ang
Qi Wu
13
1
0
26 May 2023
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning
Chia-Wen Kuo
Z. Kira
37
21
0
25 May 2023
TOAST: Transfer Learning via Attention Steering
Baifeng Shi
Siyu Gai
Trevor Darrell
Xin Wang
39
9
0
24 May 2023
Evolutionary Algorithms in the Light of SGD: Limit Equivalence, Minima Flatness, and Transfer Learning
Andrei Kucharavy
R. Guerraoui
Ljiljana Dolamic
32
1
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
34
5
0
20 May 2023
Explaining V1 Properties with a Biologically Constrained Deep Learning Architecture
Galen Pogoncheff
Jacob Granley
M. Beyeler
AAML
FAtt
11
9
0
18 May 2023
Emergent Communication with Attention
Ryokan Ri
Ryo Ueda
Jason Naradowsky
24
2
0
18 May 2023
A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot
Aanisha Bhattacharya
Yaman Kumar Singla
Balaji Krishnamurthy
R. Shah
Changyou Chen
VGen
32
11
0
16 May 2023
PLIP: Language-Image Pre-training for Person Representation Learning
Jia-li Zuo
Jiahao Hong
Feng Zhang
Changqian Yu
Hanyu Zhou
Changxin Gao
Nong Sang
Jingdong Wang
VLM
MLLM
35
31
0
15 May 2023
Mask to reconstruct: Cooperative Semantics Completion for Video-text Retrieval
Han Fang
Zhifei Yang
Xianghao Zang
Chao Ban
Hao Sun
VGen
34
2
0
13 May 2023
Automatic Radiology Report Generation by Learning with Increasingly Hard Negatives
Bhanu Prakash Voutharoja
Lei Wang
Luping Zhou
MedIm
33
8
0
11 May 2023
Learning the Visualness of Text Using Large Vision-Language Models
Gaurav Verma
Ryan A. Rossi
Chris Tensmeyer
Jiuxiang Gu
A. Nenkova
VLM
14
0
0
11 May 2023
Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification
Xulin Li
Yan Lu
B. Liu
Yuenan Hou
Yating Liu
Qi Chu
Wanli Ouyang
Nenghai Yu
OOD
CML
35
4
0
10 May 2023