1502.03044
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
10 February 2015
Kelvin Xu
Jimmy Ba
Ryan Kiros
Kyunghyun Cho
Aaron Courville
Ruslan Salakhutdinov
R. Zemel
Yoshua Bengio
DiffM
Papers citing
"Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
50 / 3,520 papers shown
Title
A Medical Semantic-Assisted Transformer for Radiographic Report Generation
Zhanyu Wang
Mingkang Tang
Lei Wang
Xiu Li
Luping Zhou
ViT
MedIm
81
58
0
22 Aug 2022
Mix-Pooling Strategy for Attention Mechanism
Shan Zhong
Wushao Wen
Jinghui Qin
82
3
0
22 Aug 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang
Dongliang He
Wenhao Wu
Boyang Xia
Min Yang
Fu Li
YunLong Yu
Zhong Ji
Errui Ding
Jingdong Wang
64
23
0
21 Aug 2022
Offline Handwritten Mathematical Recognition using Adversarial Learning and Transformers
U. Thakur
Anuj Sharma
OffRL
43
5
0
20 Aug 2022
Booster-SHOT: Boosting Stacked Homography Transformations for Multiview Pedestrian Detection with Attention
Jinwoo Hwang
Philipp Benz
Tae-Hoon Kim
ViT
75
3
0
19 Aug 2022
Sequence Prediction Under Missing Data: An RNN Approach Without Imputation
Soumen Pachal
Avinash Achar
AI4TS
36
4
0
18 Aug 2022
Look in Different Views: Multi-Scheme Regression Guided Cell Instance Segmentation
Menghao Li
W. Feng
Shuchang Lyu
Lijiang Chen
Qi Zhao
73
0
0
17 Aug 2022
Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
128
22
0
13 Aug 2022
A Means-End Account of Explainable Artificial Intelligence
O. Buchholz
XAI
76
12
0
09 Aug 2022
Distinctive Image Captioning via CLIP Guided Group Optimization
Youyuan Zhang
Jiuniu Wang
Hao Wu
Wenjia Xu
VLM
103
8
0
08 Aug 2022
Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences
Qianying Lin
Wen-Ji Zhou
Yanshi Wang
Qing Da
Qingguo Chen
Bing Wang
VLM
43
9
0
08 Aug 2022
Fine-Grained Semantically Aligned Vision-Language Pre-Training
Juncheng Li
Xin He
Longhui Wei
Long Qian
Linchao Zhu
Lingxi Xie
Yueting Zhuang
Qi Tian
Siliang Tang
VLM
106
80
0
04 Aug 2022
Benchmarking Visual-Inertial Deep Multimodal Fusion for Relative Pose Regression and Odometry-aided Absolute Pose Regression
Felix Ott
Nisha Lakshmana Raichur
David Rügamer
Tobias Feigl
Heiko Neumann
Bernd Bischl
Christopher Mutschler
86
2
0
01 Aug 2022
Object-ABN: Learning to Generate Sharp Attention Maps for Action Recognition
Tomoya Nitta
Tsubasa Hirakawa
H. Fujiyoshi
Toru Tamaki
100
0
0
27 Jul 2022
Multi-Attention Network for Compressed Video Referring Object Segmentation
Weidong Chen
Dexiang Hong
Yuankai Qi
Zhenjun Han
Shuhui Wang
Laiyun Qing
Qingming Huang
Guorong Li
VOS
55
40
0
26 Jul 2022
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
103
10
0
25 Jul 2022
Improved Super Resolution of MR Images Using CNNs and Vision Transformers
Dwarikanath Mahapatra
SupR
ViT
MedIm
66
5
0
24 Jul 2022
When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition
Bohan Li
Ye Yuan
Dingkang Liang
Xiao-Chang Liu
Zhilong Ji
Jinfeng Bai
Wenyu Liu
Xiang Bai
88
50
0
23 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
89
22
0
22 Jul 2022
Zero-Shot Video Captioning with Evolving Pseudo-Tokens
Yoad Tewel
Yoav Shalev
Roy Nadler
Idan Schwartz
Lior Wolf
70
27
0
22 Jul 2022
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
76
15
0
22 Jul 2022
EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer
Chenyu Yang
W. He
Yingqing Xu
Yang Gao
DiffM
71
27
0
20 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
89
114
0
20 Jul 2022
Explicit Image Caption Editing
Zhen Wang
Long Chen
Wenbo Ma
G. Han
Yulei Niu
Jian Shao
Jun Xiao
72
12
0
20 Jul 2022
Deep Analysis of Visual Product Reviews
Chandranath Adak
S. Chattopadhyay
Muhammad Saqib
25
2
0
19 Jul 2022
Relational Future Captioning Model for Explaining Likely Collisions in Daily Tasks
Motonari Kambara
K. Sugiura
63
6
0
19 Jul 2022
Superficial White Matter Analysis: An Efficient Point-cloud-based Deep Learning Framework with Supervised Contrastive Learning for Consistent Tractography Parcellation across Populations and dMRI Acquisitions
Tengfei Xue
Fan Zhang
Chaoyi Zhang
Yuqian Chen
Yang Song
A. Golby
N. Makris
Yogesh Rathi
Weidong (Tom) Cai
L. O’Donnell
91
38
0
18 Jul 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
79
68
0
17 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
102
37
0
13 Jul 2022
Skeletal Human Action Recognition using Hybrid Attention based Graph Convolutional Network
Hao Xing
Darius Burschka
GNN
3DH
64
7
0
12 Jul 2022
Trusted Multi-Scale Classification Framework for Whole Slide Image
Ming Feng
Kele Xu
Na Wu
Weiquan Huang
Yan Bai
Changjian Wang
Huaimin Wang
67
6
0
12 Jul 2022
CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition
Wenqi Zhao
Liang Gao
ViT
73
29
0
10 Jul 2022
Horizontal and Vertical Attention in Transformers
Litao Yu
Shuai Liu
ViT
49
1
0
10 Jul 2022
Towards Multimodal Vision-Language Models Generating Non-Generic Text
Wes Robbins
Zanyar Zohourianshahzadi
Jugal Kalita
56
1
0
09 Jul 2022
Seasonal Encoder-Decoder Architecture for Forecasting
Avinash Achar
Soumen Pachal
BDL
AI4TS
23
0
0
08 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
70
3
0
07 Jul 2022
Scene-Aware Prompt for Multi-modal Dialogue Understanding and Generation
Bin Li
Yixuan Weng
Ziyu Ma
Bin Sun
Shutao Li
VLM
36
2
0
05 Jul 2022
Vision-and-Language Pretraining
Thong Nguyen
Cong-Duy Nguyen
Xiaobao Wu
See-Kiong Ng
Anh Tuan Luu
VLM
CLIP
72
2
0
05 Jul 2022
Are metrics measuring what they should? An evaluation of image captioning task metrics
Othón González-Chávez
Guillermo Ruiz
Daniela Moctezuma
Tania A. Ramirez-delreal
73
9
0
04 Jul 2022
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts
Chuan Guo
Xinxin Zuo
Sen Wang
Li Cheng
VGen
197
244
0
04 Jul 2022
Multi-scale alignment and Spatial ROI Module for COVID-19 Diagnosis
Hongyan Xu
Dadong Wang
Arcot Sowmya
89
1
0
04 Jul 2022
Attributed Abnormality Graph Embedding for Clinically Accurate X-Ray Report Generation
Sixing Yan
William K. Cheung
Keith W H Chiu
Terence M. Tong
Charles K. Cheung
Simon See
MedIm
87
17
0
04 Jul 2022
PhilaeX: Explaining the Failure and Success of AI Models in Malware Detection
Zhi Lu
V. Thing
AAML
25
5
0
02 Jul 2022
Rethinking Query-Key Pairwise Interactions in Vision Transformers
Cheng-rong Li
Yangxin Liu
72
0
0
01 Jul 2022
Personalized Showcases: Generating Multi-Modal Explanations for Recommendations
An Yan
Zhankui He
Jiacheng Li
Tianyang Zhang
Julian McAuley
88
37
0
30 Jun 2022
Overview of Deep Learning-based CSI Feedback in Massive MIMO Systems
Jiajia Guo
Chao-Kai Wen
Shi Jin
Geoffrey Ye Li
94
155
0
29 Jun 2022
ZoDIAC: Zoneout Dropout Injection Attention Calculation
Zanyar Zohourianshahzadi
Jugal Kalita
96
0
0
28 Jun 2022
Consistency-preserving Visual Question Answering in Medical Imaging
Sergio Tascon-Morales
Pablo Márquez-Neila
Raphael Sznitman
MedIm
92
12
0
27 Jun 2022
Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness
Hideaki Iiduka
73
5
0
27 Jun 2022
Representative Teacher Keys for Knowledge Distillation Model Compression Based on Attention Mechanism for Image Classification
Jun-Teng Yang
Sheng-Che Kao
S. Huang
45
0
0
26 Jun 2022