2403.03715
Cited By
MeaCap: Memory-Augmented Zero-shot Image Captioning
6 March 2024
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
Papers citing
"MeaCap: Memory-Augmented Zero-shot Image Captioning"
13 / 13 papers shown
Title
Zero-Shot, But at What Cost? Unveiling the Hidden Overhead of MILS's LLM-CLIP Framework for Image Captioning
Yassir Benhammou
Alessandro Tiberio
Gabriel Trautmann
Suman Kalyan
MLLM
VLM
43
0
0
21 Apr 2025
The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning
Mingkai Tian
Guorong Li
Yuankai Qi
Amin Beheshti
J. Shi
Anton van den Hengel
Qingming Huang
VGen
32
0
0
31 Mar 2025
Explaining Domain Shifts in Language: Concept erasing for Interpretable Image Classification
Zequn Zeng
Yudi Su
Jianqiao Sun
Tiansheng Wen
Hao Zhang
Zhengjue Wang
Bo Chen
Hongwei Liu
Jiawei Ma
VLM
60
0
0
24 Mar 2025
LaVCa: LLM-assisted Visual Cortex Captioning
Takuya Matsuyama
Shinji Nishimoto
Yu Takagi
58
0
0
20 Feb 2025
DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding
Hao Wu
Zhihang Zhong
Xiao Sun
DiffM
70
0
0
02 Dec 2024
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity
Hang Hua
Qing Liu
Lingzhi Zhang
Jing Shi
Zhifei Zhang
Yilin Wang
Jianming Zhang
Jiebo Luo
CoGe
VLM
90
6
0
23 Nov 2024
Historical Test-time Prompt Tuning for Vision Foundation Models
Jingyi Zhang
Jiaxing Huang
Xiaoqin Zhang
Ling Shao
Shijian Lu
VLM
30
4
0
27 Oct 2024
IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning
Soeun Lee
Si-Woo Kim
Taewhan Kim
Dong-Jin Kim
CLIP
VLM
26
0
0
26 Sep 2024
Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation
Cephas Mpungu
Qiyuan Chen
Wei Wei
Jiashuo Sun
G. Mapp
VLM
RALM
LRM
22
16
0
01 Aug 2024
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
Zequn Zeng
Jianqiao Sun
Hao Zhang
Tiansheng Wen
Yudi Su
Yan Xie
Zhengjue Wang
Boli Chen
46
3
0
26 Jul 2024
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
42
86
0
06 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,229
0
30 Jan 2023
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
52
94
0
01 Nov 2022