ResearchTrend.AI
Normalized and Geometry-Aware Self-Attention Network for Image Captioning

19 March 2020
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
    ViT

Papers citing "Normalized and Geometry-Aware Self-Attention Network for Image Captioning"

19 / 19 papers shown
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
42
0
0
03 Apr 2025
An Ensemble Model with Attention Based Mechanism for Image Captioning
Israa Al Badarneh
Bassam Hammo
Omar Al-Kadi
45
3
0
28 Jan 2025
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
55
19
0
23 Aug 2023
Reverse Stable Diffusion: What prompt was used to generate this image?
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
VLM
DiffM
34
6
0
02 Aug 2023
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
37
10
0
04 Oct 2022
Paraphrasing Is All You Need for Novel Object Captioning
Cheng Yang
Yao-Hung Hubert Tsai
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
36
4
0
25 Sep 2022
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Junghun Kim
Yoojin An
Jihie Kim
14
13
0
21 Aug 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
25
106
0
20 Jul 2022
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
17
87
0
14 Jun 2022
Dual Windows Are Significant: Learning from Mediastinal Window and Focusing on Lung Window
Qiuli Wang
Xin Tan
Chen Liu
13
0
0
08 Jun 2022
Collaborative Transformers for Grounded Situation Recognition
Junhyeong Cho
Youngseok Yoon
Suha Kwak
ViT
17
25
0
30 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
CLIP Meets Video Captioning: Concept-Aware Representation Learning Does Matter
Bang-ju Yang
Tong Zhang
Yuexian Zou
CLIP
25
20
0
30 Nov 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
34
192
0
29 Nov 2021
Self-Annotated Training for Controllable Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
22
2
0
16 Oct 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
55
254
0
14 Jul 2021
Progressively Normalized Self-Attention Network for Video Polyp Segmentation
Ge-Peng Ji
Yu-Cheng Chou
Deng-Ping Fan
Geng Chen
H. Fu
Debesh Jha
Ling Shao
ViT
20
137
0
18 May 2021
Exploring Explicit and Implicit Visual Relationships for Image Captioning
Zeliang Song
Xiaofei Zhou
19
7
0
06 May 2021
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
41
170
0
13 Dec 2020