2010.00475
FSD50K: An Open Dataset of Human-Labeled Sound Events
1 October 2020
Eduardo Fonseca
Xavier Favory
Jordi Pons
F. Font
Xavier Serra
Papers citing "FSD50K: An Open Dataset of Human-Labeled Sound Events"
50 / 78 papers shown
X-ARES: A Comprehensive Framework for Assessing Audio Encoder Performance
Junbo Zhang
Heinrich Dinkel
Yadong Niu
Chenyu Liu
Si Cheng
Anbei Zhao
Jian Luan
153
0
0
22 May 2025
Large Language Models Implicitly Learn to See and Hear Just By Reading
Prateek Verma
Mert Pilanci
176
0
0
20 May 2025
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement
Junan Zhang
Jing Yang
Zihao Fang
Yansen Wang
Zehua Zhang
Zhuo Wang
Fan Fan
Zhikai Wu
125
4
0
26 Jan 2025
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement
Jae-Sung Bae
Anastasia Kuznetsova
Dinesh Manocha
John Hershey
Trausti Kristjansson
Minje Kim
113
0
0
23 Jan 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
149
2
0
10 Jan 2025
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Wei Guo
Heng Wang
Jianbo Ma
Weidong Cai
DiffM
153
5
0
23 Nov 2024
Do Audio-Language Models Understand Linguistic Variations?
Ramaneswaran Selvakumar
Sonal Kumar
Hemant Kumar Giri
Nishit Anand
Ashish Seth
Sreyan Ghosh
Dinesh Manocha
AuLLM
VLM
116
1
0
21 Oct 2024
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
102
4
0
14 Oct 2024
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh
Sonal Kumar
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
Dinesh Manocha
DiffM
104
3
0
02 Oct 2024
Exploring Text-Queried Sound Event Detection with Audio Source Separation
Han Yin
Jisheng Bai
Yang Xiao
Hui Wang
Siqi Zheng
Yafeng Chen
Rohan Kumar Das
Chong Deng
Jianfeng Chen
78
3
0
20 Sep 2024
AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions
Yun Wang
Hangting Chen
Dongchao Yang
Zhiyong Wu
Xixin Wu
DiffM
88
2
0
19 Sep 2024
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models
Potsawee Manakul
Guangzhi Sun
Warit Sirichotedumrong
Kasima Tharnpipitchai
Kunat Pipatanakul
AuLLM
86
7
0
17 Sep 2024
High-Resolution Speech Restoration with Latent Diffusion Model
Tushar Dhyani
Florian Lux
Michele Mancusi
Giorgio Fabbro
Fritz Hohl
Ngoc Thang Vu
DiffM
113
0
0
17 Sep 2024
MambaFoley: Foley Sound Generation using Selective State-Space Models
Marco Furio Colombo
Francesca Ronchini
Luca Comanducci
Fabio Antonacci
Mamba
63
1
0
13 Sep 2024
Salmon: A Suite for Acoustic Language Model Evaluation
Gallil Maimon
Amit Roth
Yossi Adi
ELM
AuLLM
115
6
0
11 Sep 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji
Ziyue Jiang
Xize Cheng
Yifu Chen
Minghui Fang
...
Rongjie Huang
Yidi Jiang
Qian Chen
Zhou Zhao
VLM
119
43
0
29 Aug 2024
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Sarthak Yadav
Sergios Theodoridis
Zheng-Hua Tan
87
3
0
29 Aug 2024
Contrastive Learning from Synthetic Audio Doppelgängers
Manuel Cherep
Nikhil Singh
83
1
0
09 Jun 2024
Listenable Maps for Zero-Shot Audio Classifiers
Francesco Paissan
Luca Della Libera
Mirco Ravanelli
Cem Subakan
83
4
0
27 May 2024
Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks
Eduardo Fonseca
Andrés Ferraro
Xavier Serra
AI4TS
101
9
0
01 Jul 2021
The Benefit Of Temporally-Strong Labels In Audio Event Classification
Shawn Hershey
D. Ellis
Eduardo Fonseca
A. Jansen
Caroline Liu
Channing Moore
Manoj Plakal
58
104
0
14 May 2021
USM-SED - A Dataset for Polyphonic Sound Event Detection in Urban Sound Monitoring Scenarios
J. Abeßer
45
6
0
06 May 2021
Self-Supervised Learning from Automatically Separated Sound Scenes
Eduardo Fonseca
A. Jansen
D. Ellis
Scott Wisdom
Marco Tagliasacchi
J. Hershey
Manoj Plakal
Shawn Hershey
R. C. Moore
Xavier Serra
SSL
69
13
0
05 May 2021
Unsupervised Contrastive Learning of Sound Event Representations
Eduardo Fonseca
Diego Ortego
Kevin McGuinness
Noel E. O'Connor
Xavier Serra
SSL
68
65
0
15 Nov 2020
What's All the FUSS About Free Universal Sound Separation Data?
Scott Wisdom
Hakan Erdogan
D. Ellis
Romain Serizel
Nicolas Turpault
Eduardo Fonseca
Justin Salamon
Prem Seetharaman
J. Hershey
69
82
0
02 Nov 2020
SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context
M. Cartwright
J. Cramer
Ana Elisa Méndez Méndez
Yu Wang
Ho-Hsiang Wu
...
Graham Dove
C. Mydlarz
Justin Salamon
O. Nov
J. P. Bello
53
36
0
11 Sep 2020
Improving Sound Event Detection In Domestic Environments Using Sound Separation
Nicolas Turpault
Scott Wisdom
Hakan Erdogan
J. Hershey
Romain Serizel
Eduardo Fonseca
Prem Seetharaman
Justin Salamon
79
49
0
08 Jul 2020
COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations
Xavier Favory
Konstantinos Drossos
Tuomas Virtanen
Xavier Serra
105
32
0
15 Jun 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
124
401
0
12 Jun 2020
Evaluation of CNN-based Automatic Music Tagging Models
Minz Won
Andrés Ferraro
Dmitry Bogdanov
Xavier Serra
VLM
363
100
0
01 Jun 2020
Audio and Contact Microphones for Cough Detection
Thomas Drugman
J. Urbain
N. Bauwens
Ricardo Chessini
A. Aubriot
P. Lebecque
Thierry Dutoit
18
33
0
10 May 2020
Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking
Eduardo Fonseca
Shawn Hershey
Manoj Plakal
D. Ellis
A. Jansen
R. C. Moore
Xavier Serra
NoLa
90
23
0
02 May 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
89
577
0
29 Apr 2020
Search Result Clustering in Collaborative Sound Collections
Xavier Favory
F. Font
Xavier Serra
15
5
0
08 Apr 2020
Active Learning for Sound Event Detection
Shuyang Zhao
Toni Heittola
Tuomas Virtanen
44
27
0
12 Feb 2020
Learning with Out-of-Distribution Data for Audio Classification
Turab Iqbal
Yin Cao
Qiuqiang Kong
Mark D. Plumbley
Wenwu Wang
OODD
31
17
0
11 Feb 2020
Limitations of weak labels for embedding and tagging
Nicolas Turpault
Romain Serizel
Emmanuel Vincent
51
9
0
05 Feb 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
192
1,082
0
21 Dec 2019
VOICe: A Sound Event Detection Dataset For Generalizable Domain Adaptation
Shayan Gharib
Konstantinos Drossos
Eemi Fagerlund
Tuomas Virtanen
26
3
0
16 Nov 2019
OtoMechanic: Auditory Automobile Diagnostics via Query-by-Example
Max Morrison
Bryan Pardo
18
3
0
05 Nov 2019
Confident Learning: Estimating Uncertainty in Dataset Labels
Curtis G. Northcutt
Lu Jiang
Isaac L. Chuang
NoLa
149
692
0
31 Oct 2019
Model-agnostic Approaches to Handling Noisy Labels When Training Sound Event Classifiers
Eduardo Fonseca
F. Font
Xavier Serra
NoLa
69
9
0
26 Oct 2019
A Framework for the Robust Evaluation of Sound Event Detection
Cagdas Bilen
Giacomo Ferroni
Francesco Tuveri
Juan Azcarreta
Sacha Krstulović
75
163
0
18 Oct 2019
A hybrid parametric-deep learning approach for sound event localization and detection
A. Pérez-López
Eduardo Fonseca
Xavier Serra
115
6
0
27 Aug 2019
Natural Adversarial Examples
Dan Hendrycks
Kevin Zhao
Steven Basart
Jacob Steinhardt
Basel Alomair
OODD
212
1,472
0
16 Jul 2019
Audio tagging with noisy labels and minimal supervision
Eduardo Fonseca
Manoj Plakal
F. Font
D. Ellis
Xavier Serra
61
93
0
07 Jun 2019
A multi-room reverberant dataset for sound event localization and detection
Sharath Adavanne
Archontis Politis
Tuomas Virtanen
48
111
0
21 May 2019
Universal Sound Separation
Ilya Kavalerov
Scott Wisdom
Hakan Erdogan
Brian Patton
K. Wilson
Jonathan Le Roux
J. Hershey
44
187
0
08 May 2019
Do ImageNet Classifiers Generalize to ImageNet?
Benjamin Recht
Rebecca Roelofs
Ludwig Schmidt
Vaishaal Shankar
OOD
SSeg
VLM
113
1,715
0
13 Feb 2019
Do We Train on Test Data? Purging CIFAR of Near-Duplicates
Björn Barz
Joachim Denzler
67
97
0
01 Feb 2019