1502.03044
v3 (latest)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
Enhancing Scalability in Recommender Systems through Lottery Ticket Hypothesis and Knowledge Distillation-based Neural Network Pruning
R. Rajaram
Manoj Bharadhwaj
VS Vasan
N. Pervin
33
1
0
19 Jan 2024
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang
Yixiao Ge
Yuying Ge
Dachuan Shi
Chun Yuan
Ying Shan
VLM
CLIP
94
9
0
18 Jan 2024
Jewelry Recognition via Encoder-Decoder Models
José M. Alcalde-Llergo
Enrique Yeguas-Bolivar
Andrea Zingoni
Alejandro Fuerte-Jurado
40
0
0
15 Jan 2024
Survey of Natural Language Processing for Education: Taxonomy, Systematic Review, and Future Trends
Yunshi Lan
Xinyuan Li
Hanyue Du
Xuesong Lu
Ming Gao
Weining Qian
Aoying Zhou
106
4
0
15 Jan 2024
HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models
Hanzhang Wang
Haoran Wang
Jinze Yang
Zhongrui Yu
Zeke Xie
Lei Tian
Xinyan Xiao
Junjun Jiang
Xianming Liu
Mingming Sun
DiffM
58
1
0
11 Jan 2024
Complementary Information Mutual Learning for Multimodality Medical Image Segmentation
Chuyun Shen
Wenhao Li
Haoqing Chen
Xiaoling Wang
Fengping Zhu
Yuxin Li
Xiangfeng Wang
Bo Jin
85
3
0
05 Jan 2024
Object-oriented backdoor attack against image captioning
Meiling Li
Nan Zhong
Xinpeng Zhang
Zhenxing Qian
Sheng Li
65
8
0
05 Jan 2024
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training
Longtian Qiu
Shan Ning
Xuming He
VLM
72
4
0
04 Jan 2024
Short-Term Multi-Horizon Line Loss Rate Forecasting of a Distribution Network Using Attention-GCN-LSTM
Jie Liu
Yijia Cao
Yong Li
Yixiu Guo
Wei Deng
107
1
0
19 Dec 2023
Satellite Captioning: Large Language Models to Augment Labeling
Grant Rosario
David Noever
209
0
0
18 Dec 2023
Dual Branch Network Towards Accurate Printed Mathematical Expression Recognition
Yuqing Wang
Zhenyu Weng
Zhaokun Zhou
Shuaijian Ji
Zhongjie Ye
Yuesheng Zhu
62
2
0
14 Dec 2023
See, Say, and Segment: Teaching LMMs to Overcome False Premises
Tsung-Han Wu
Giscard Biamby
David M. Chan
Lisa Dunlap
Ritwik Gupta
Xudong Wang
Joseph E. Gonzalez
Trevor Darrell
VLM
MLLM
115
21
0
13 Dec 2023
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Issam Serraoui
Eric Granger
Abdenour Hadid
Abdelmalik Taleb-Ahmed
61
0
0
12 Dec 2023
Medical Vision Language Pretraining: A survey
Prashant Shrestha
Sanskar Amgain
Bidur Khanal
Cristian A. Linte
Binod Bhattarai
VLM
100
17
0
11 Dec 2023
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang
David Yunis
Michael Maire
56
4
0
11 Dec 2023
PixLore: A Dataset-driven Approach to Rich Image Captioning
Diego Bonilla
VLM
24
0
0
08 Dec 2023
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
Xuan Wang
Guanhong Wang
Wenhao Chai
Jiayu Zhou
Gaoang Wang
153
6
0
08 Dec 2023
Adaptive Dependency Learning Graph Neural Networks
Abishek Sriramulu
Nicolas Fourrier
Christoph Bergmeir
AI4TS
AI4CE
71
21
0
06 Dec 2023
Enhancing Image Captioning with Neural Models
Pooja Bhatnagar
Sai Mrunaal
Sachin Kamnure
VLM
46
0
0
01 Dec 2023
Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI
Xuan-Bac Nguyen
Xin Li
Pawan Sinha
Samee U. Khan
Khoa Luu
ViT
MedIm
96
0
0
30 Nov 2023
Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu
Yixin Liu
Ninghao Liu
Mengdi Huai
Lichao Sun
Di Wang
89
9
0
29 Nov 2023
EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension
Jiaxuan Li
D. Vo
Akihiro Sugimoto
Hideki Nakayama
KELM
VLM
104
25
0
27 Nov 2023
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
Maurice Günder
Sneha Banerjee
R. Sifa
Christian Bauckhage
FAtt
42
0
0
27 Nov 2023
WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images
Pingyi Chen
Honglin Li
Chenglu Zhu
Sunyi Zheng
Zhongyi Shui
Lin Yang
54
9
0
27 Nov 2023
DECap: Towards Generalized Explicit Caption Editing via Diffusion Mechanism
Zhen Wang
Xinyun Jiang
Jun Xiao
Tao Chen
Long Chen
DiffM
54
1
0
25 Nov 2023
Unified Medical Image Pre-training in Language-Guided Common Semantic Space
Xiaoxuan He
Yifan Yang
Xinyang Jiang
Xufang Luo
Haoji Hu
Siyun Zhao
Dongsheng Li
Yuqing Yang
Lili Qiu
80
2
0
24 Nov 2023
Causality is all you need
Ning Xu
Yifei Gao
Hongshuo Tian
Yongdong Zhang
An-An Liu
82
0
0
21 Nov 2023
Identifying DNA Sequence Motifs Using Deep Learning
Asmita Poddar
Vladimir Uzun
Elizabeth Tunbridge
W. Haerty
A. Nevado-Holgado
41
0
0
20 Nov 2023
System 2 Attention (is something you might need too)
Jason Weston
Sainbayar Sukhbaatar
RALM
OffRL
LRM
93
65
0
20 Nov 2023
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder
Abdelrahman Mohamed
Fakhraddin Alwajih
El Moatez Billah Nagoudi
Alcides Alcoba Inciarte
Muhammad Abdul-Mageed
VLM
MLLM
69
7
0
15 Nov 2023
The Heat is On: Thermal Facial Landmark Tracking
James Baker
CVBM
42
0
0
14 Nov 2023
FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design
Zhen Huang
Yihao Li
Dong Pei
Jiapeng Zhou
Xuliang Ning
Jianlin Han
Xiaoguang Han
Xuejun Chen
96
3
0
13 Nov 2023
Concept-wise Fine-tuning Matters in Preventing Negative Transfer
Yunqiao Yang
Long-Kai Huang
Ying Wei
75
2
0
12 Nov 2023
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers
S. Sengupta
Donald E. Brown
VLM
MedIm
ViT
61
10
0
10 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
237
161
0
09 Nov 2023
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
Yuiga Wada
Kanta Kaneda
Komei Sugiura
68
4
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
Yaoxian Song
Penglei Sun
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
120
16
0
07 Nov 2023
Complex Organ Mask Guided Radiology Report Generation
Tiancheng Gu
Dongnan Liu
Zhiyuan Li
Weidong Cai
MedIm
78
14
0
04 Nov 2023
RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning
Ziyu Wang
Wenhao Jiang
Zixuan Zhang
Wei Tang
Junchi Yan
52
0
0
03 Nov 2023
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
Yingshu Li
Yunyi Liu
Zhanyu Wang
Xinyu Liang
Lei Wang
Lingqiao Liu
Leyang Cui
Zhaopeng Tu
Longyue Wang
Luping Zhou
ELM
LM&MA
100
39
0
31 Oct 2023
Causal Interpretation of Self-Attention in Pre-Trained Transformers
R. Y. Rohekar
Yaniv Gurwicz
Shami Nisimov
MILM
71
19
0
31 Oct 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
95
5
0
30 Oct 2023
Semi-Supervised Panoptic Narrative Grounding
Danni Yang
Jiayi Ji
Xiaoshuai Sun
Haowei Wang
Yinan Li
Yiwei Ma
Rongrong Ji
84
5
0
27 Oct 2023
Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting
Benjamin Yan
Ruochen Liu
David E. Kuo
Subathra Adithan
Eduardo Pontes Reis
...
V. Venugopal
Chloe P. O'Connell
Agustina Saenz
Pranav Rajpurkar
Michael Moor
MedIm
59
27
0
26 Oct 2023
Cross-modal Active Complementary Learning with Self-refining Correspondence
Yang Qin
Yuan Sun
Dezhong Peng
Qiufeng Wang
Xiaocui Peng
Peng Hu
100
21
0
26 Oct 2023
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing
Anant Khandelwal
130
2
0
24 Oct 2023
CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting
Lei Li
115
24
0
24 Oct 2023
PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining
Kecen Li
Chen Gong
Zhixiang Li
Yuzhong Zhao
Xinwen Hou
Tianhao Wang
99
10
0
19 Oct 2023
Getting aligned on representational alignment
Ilia Sucholutsky
Lukas Muttenthaler
Adrian Weller
Andi Peng
Andreea Bobu
...
Thomas Unterthiner
Andrew Kyle Lampinen
Klaus-Robert Muller
M. Toneva
Thomas Griffiths
158
93
0
18 Oct 2023
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World
Rujie Wu
Xiaojian Ma
Zhenliang Zhang
Wei Wang
Qing Li
Song-Chun Zhu
Yizhou Wang
LRM
VLM
153
9
0
16 Oct 2023