Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning

24 September 2020

Yasunori Ohishi

Papers citing "Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning"

Title | Authors | Topic | Counts | Date
Audio-Language Datasets of Scenes and Events: A Survey | Gijs Wijngaard, Elia Formisano, Michele Esposito, M. Dumontier | - | 81 2 0 | 10 Jan 2025
Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement | Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, K. Kashino | - | 32 6 0 | 23 Aug 2023
Automated Audio Captioning: An Overview of Recent Progress and New Challenges | Xinhao Mei, Xubo Liu, Mark D. Plumbley, Wenwu Wang | - | 29 38 0 | 12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning | Xuenan Xu, Zeyu Xie, Mengyue Wu, K. Yu | - | 44 13 0 | 11 May 2022
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning | Sreyan Ghosh, Ashish Seth, Deepak Mittal, Maneesh Singh, S. Umesh | SSL | 27 6 0 | 25 Mar 2022
DECAR: Deep Clustering for learning general-purpose Audio Representations | Sreyan Ghosh, Sandesh V Katta, Ashish Seth, S. Umesh | SSL | 36 12 0 | 17 Oct 2021
Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning | Benno Weck, Xavier Favory, Konstantinos Drossos, Xavier Serra | - | 23 8 0 | 14 Oct 2021
Audio Captioning Transformer | Xinhao Mei, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang | ViT | 39 77 0 | 21 Jul 2021
Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach | Jan van den Berg, Konstantinos Drossos | CLL | 19 11 0 | 16 Jul 2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation | Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, K. Kashino | SSL | 38 175 0 | 11 Mar 2021
Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval | Yuma Koizumi, Yasunori Ohishi, Daisuke Niizumi, Daiki Takeuchi, Masahiro Yasuda | - | 30 40 0 | 14 Dec 2020
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information | An Tran, Konstantinos Drossos, Tuomas Virtanen | - | 47 19 0 | 21 Oct 2020
Acoustic Scene Classification | D. Barchiesi, D. Giannoulis, D. Stowell, Mark D. Plumbley | - | 102 406 0 | 13 Nov 2014