2003.08897
Cited By
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
19 March 2020
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
Papers citing "Normalized and Geometry-Aware Self-Attention Network for Image Captioning"
19 / 19 papers shown
Title
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
42
0
0
03 Apr 2025
An Ensemble Model with Attention Based Mechanism for Image Captioning
Israa Al Badarneh
Bassam Hammo
Omar Al-Kadi
45
3
0
28 Jan 2025
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
19
0
23 Aug 2023
Reverse Stable Diffusion: What prompt was used to generate this image?
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
VLM
DiffM
34
6
0
02 Aug 2023
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
37
10
0
04 Oct 2022
Paraphrasing Is All You Need for Novel Object Captioning
Cheng Yang
Yao-Hung Hubert Tsai
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
36
4
0
25 Sep 2022
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Junghun Kim
Yoojin An
Jihie Kim
14
13
0
21 Aug 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
25
106
0
20 Jul 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
17
87
0
14 Jun 2022
Dual Windows Are Significant: Learning from Mediastinal Window and Focusing on Lung Window
Qiuli Wang
Xin Tan
Chen Liu
13
0
0
08 Jun 2022
Collaborative Transformers for Grounded Situation Recognition
Junhyeong Cho
Youngseok Yoon
Suha Kwak
ViT
17
25
0
30 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
34
192
0
29 Nov 2021
Self-Annotated Training for Controllable Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
22
2
0
16 Oct 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
55
254
0
14 Jul 2021
Progressively Normalized Self-Attention Network for Video Polyp Segmentation
Ge-Peng Ji
Yu-Cheng Chou
Deng-Ping Fan
Geng Chen
H. Fu
Debesh Jha
Ling Shao
ViT
20
137
0
18 May 2021
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
19
7
0
06 May 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
41
170
0
13 Dec 2020