2406.06295
Zero-Shot Audio Captioning Using Soft and Hard Prompts
10 June 2024
Yiming Zhang
Xuenan Xu
Ruoyi Du
Haohe Liu
Yuan Dong
Zheng-Hua Tan
Wenwu Wang
Zhanyu Ma
VLM
Papers citing
"Zero-Shot Audio Captioning Using Soft and Hard Prompts"
18 / 18 papers shown
Title
LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections
M. Jehanzeb Mirza
Leonid Karlinsky
Wei Lin
Mateusz Koziński
Horst Possegger
Rogerio Feris
Horst Bischof
VLM
76
32
0
29 May 2023
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
54
45
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
87
209
0
30 Mar 2023
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
112
521
0
12 Nov 2022
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
93
97
0
01 Nov 2022
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Zhongjie Ye
Helin Wang
Dongchao Yang
Yuexian Zou
71
28
0
12 Oct 2021
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
39
20
0
10 Aug 2021
Learning How to Ask: Querying LMs with Mixtures of Soft Prompts
Guanghui Qin
J. Eisner
55
546
0
14 Apr 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
215
4,244
0
01 Jan 2021
Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning
Daiki Takeuchi
Yuma Koizumi
Yasunori Ohishi
Noboru Harada
K. Kashino
38
27
0
24 Sep 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
180
1,075
0
21 Dec 2019
Clotho: An Audio Captioning Dataset
Konstantinos Drossos
Samuel Lipping
Tuomas Virtanen
87
388
0
21 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
538
24,422
0
26 Jul 2019
Unsupervised Data Augmentation for Consistency Training
Qizhe Xie
Zihang Dai
Eduard H. Hovy
Minh-Thang Luong
Quoc V. Le
124
2,314
0
29 Apr 2019
Improved Image Captioning via Policy Gradient optimization of SPIDEr
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
120
446
0
01 Dec 2016
SPICE: Semantic Propositional Image Caption Evaluation
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
90
1,914
0
29 Jul 2016
Microsoft COCO Captions: Data Collection and Evaluation Server
Xinlei Chen
Hao Fang
Nayeon Lee
Ramakrishna Vedantam
Saurabh Gupta
Piotr Dollar
C. L. Zitnick
203
2,475
0
01 Apr 2015
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
268
4,471
0
20 Nov 2014