2404.06095
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
9 April 2024
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
Papers citing
"Masked Modeling Duo: Towards a Universal Audio Pre-training Framework"
12 / 12 papers shown
Title
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
131
0
0
09 May 2025
Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis
Daisuke Niizumi
Daiki Takeuchi
Masahiro Yasuda
Binh Thien Nguyen
Yasunori Ohishi
N. Harada
32
0
0
25 Apr 2025
Myna: Masking-Based Contrastive Learning of Musical Representations
Ori Yonay
Tracy Hammond
Tianbao Yang
AAML
61
0
0
20 Feb 2025
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model
Carlos Hernandez-Olivan
Marc Delcroix
Tsubasa Ochiai
Daisuke Niizumi
Naohiro Tawara
Tomohiro Nakatani
Shoko Araki
34
2
0
19 Sep 2024
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schlüter
Paul Primus
Gerhard Widmer
ViT
25
2
0
14 Sep 2024
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Md Awsafur Rahman
Zaber Ibn Abdul Hakim
Najibul Haque Sarker
Bishmoy Paul
S. Fattah
40
7
0
26 Aug 2024
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters
Junyi Peng
Themos Stafylakis
Rongzhi Gu
Oldřich Plchot
Ladislav Mošner
Lukáš Burget
Jan "Honza" Černocký
34
22
0
28 Oct 2022
Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning
Ilyass Moummad
Nicolas Farrugia
29
17
0
27 Oct 2022
Self-Distillation for Further Pre-training of Transformers
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
Kenji Kawaguchi
45
8
0
30 Sep 2022
Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li
Alexei A. Efros
Deepak Pathak
SSL
45
33
0
29 Sep 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
118
264
0
02 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021