1912.10211
Cited By
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
21 December 2019
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM
SSL
Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"
50 / 216 papers shown
Title
An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification
Zhi-Wei Zhong
M. Hirano
Kazuki Shimada
Kazuya Tateishi
Shusuke Takahashi
Yuki Mitsufuji
26
12
0
16 Feb 2023
Personalized Audio Quality Preference Prediction
Chung-Che Wang
Yu-Chun Lin
Yu-Teng Hsu
J. Jang
27
1
0
16 Feb 2023
Unsupervised classification to improve the quality of a bird song recording dataset
Félix Michaud
J. Sueur
Maxime Le Cesne
S. Haupert
29
28
0
15 Feb 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
Training one model to detect heart and lung sound events from single point auscultations
Leander Melms
Robert R. Ilesan
Ulrich Köhler
O. Hildebrandt
R. Conradt
...
Jürgen R. Schaefer
Tobias Müller
J. Obergassel
Nadine Schlicker
M. Hirsch
31
2
0
15 Jan 2023
Improving trajectory localization accuracy via direction-of-arrival derivative estimation
Ruchi Pandey
Shreya Jaiswal
Huy P Phan
S. Nannuru
30
0
0
07 Dec 2022
Interpretability Analysis of Deep Models for COVID-19 Detection
Daniel Peixoto Pinto da Silva
Edresson Casanova
L. Gris
A. Júnior
Marcelo Finger
...
Beatriz Raposo
Marcus Martins
S. Aluísio
L. Berti
João Paulo Teixeira
23
3
0
25 Nov 2022
SpectNet : End-to-End Audio Signal Classification Using Learnable Spectrograms
Md. Istiaq Ansari
Taufiq Hasan
17
4
0
17 Nov 2022
Music Instrument Classification Reprogrammed
Hsin-Hung Chen
Alexander Lerch
24
4
0
15 Nov 2022
Describing emotions with acoustic property prompts for speech emotion recognition
Hira Dhamyal
Benjamin Elizalde
Soham Deshmukh
Huaming Wang
Bhiksha Raj
Rita Singh
26
10
0
14 Nov 2022
The Birds Need Attention Too: Analysing usage of Self Attention in identifying bird calls in soundscapes
Chandra Kanth Nagesh
Abhishek Purushothama
29
2
0
14 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Yusong Wu
Kai Chen
Tianyu Zhang
Yuchen Hui
Marianna Nezhurina
Taylor Berg-Kirkpatrick
Shlomo Dubnov
CLIP
43
493
0
12 Nov 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid
Khaled Koutini
Gerhard Widmer
ViT
28
58
0
09 Nov 2022
Introducing topography in convolutional neural networks
Maxime Poli
Emmanuel Dupoux
Rachid Riad
39
0
0
28 Oct 2022
Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification
Yuanbo Hou
Siyang Song
Chuan Yu
Yuxin Song
Wenwu Wang
Dick Botteldooren
42
3
0
27 Oct 2022
Play It Back: Iterative Attention for Audio Recognition
Alexandros Stergiou
Dima Damen
42
4
0
20 Oct 2022
Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing
Georgios Rizos
J. Lawson
Simon Mitchell
Pranay Shah
Xin Wen
Cristina Banks-Leite
R. Ewers
Bjoern W. Schuller
UQCV
23
2
0
19 Oct 2022
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
L. D. Pham
Dusan Salovic
Anahid N. Jalali
Alexander Schindler
Khoa Tran
H. Vu
Phu X. Nguyen
35
5
0
16 Oct 2022
Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation
Bagus Tris Atmaja
Zanjabila
Suyanto
A. Sasou
34
1
0
12 Oct 2022
Supervised and Unsupervised Learning of Audio Representations for Music Understanding
Matthew C. McCallum
Filip Korzeniowski
Sergio Oramas
F. Gouyon
Andreas F. Ehmann
SSL
80
37
0
07 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
39
7
0
04 Oct 2022
Contrastive Audio-Visual Masked Autoencoder
Yuan Gong
Andrew Rouditchenko
Alexander H. Liu
David Harwath
Leonid Karlinsky
Hilde Kuehne
James R. Glass
45
120
0
02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
43
1
0
30 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
47
2
0
28 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations
Tung-Yu Wu
Chen-An Li
Tzu-Han Lin
Tsung-Yuan Hsu
Hung-yi Lee
37
5
0
26 Sep 2022
UniKW-AT: Unified Keyword Spotting and Audio Tagging
Heinrich Dinkel
Yongqing Wang
Zhiyong Yan
Junbo Zhang
Yujun Wang
47
3
0
23 Sep 2022
Language-based Audio Retrieval Task in DCASE 2022 Challenge
Huang Xie
Samuel Lipping
Tuomas Virtanen
79
18
0
20 Sep 2022
Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations
Paul Primus
Gerhard Widmer
29
6
0
24 Aug 2022
Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers
Paul Primus
Gerhard Widmer
VLM
27
5
0
24 Aug 2022
Fall Detection from Audios with Audio Transformers
Prabhjot Kaur
Qifan Wang
Weisong Shi
29
16
0
23 Aug 2022
Pathway to Future Symbiotic Creativity
Yi-Ting Guo
Qi-fei Liu
Jie Chen
Wei Xue
Jie Fu
...
Fernando Rosas
Jeffrey Shaw
Xing Wu
Jiji Zhang
Jianliang Xu
39
0
0
18 Aug 2022
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
26
0
0
12 Aug 2022
Seeing your sleep stage: cross-modal distillation from EEG to infrared video
Jianan Han
Shenmin Zhang
Aidong Men
Yang Liu
Z. Yao
Yan-Tao Yan
Qingchao Chen
33
4
0
11 Aug 2022
Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis
Jia Li
Ziyang Zhang
Jun Lang
Yueqi Jiang
Liuwei An
...
Sheng Gao
Jie Lin
Chunxiao Fan
Xiao Sun
Meng Wang
59
30
0
05 Aug 2022
GAFX: A General Audio Feature eXtractor
Zhaoyang Bu
Han Zhang
Xiaohu Zhu
30
0
0
19 Jul 2022
Segment-level Metric Learning for Few-shot Bioacoustic Event Detection
Haohe Liu
Xubo Liu
Xinhao Mei
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
33
8
0
15 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
28
270
0
13 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use
Jan Schlüter
Gerald Gutenbrunner
VLM
39
12
0
12 Jul 2022
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss
Andrew Koh
Chng Eng Siong
32
1
0
29 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
22
9
0
23 Jun 2022
Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression
Xin Jing
Meishu Song
Andreas Triantafyllopoulos
Zijiang Yang
Björn W. Schuller
21
8
0
18 Jun 2022
Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction
Andreas Triantafyllopoulos
Meishu Song
Zijiang Yang
Xin Jing
Björn W. Schuller
27
8
0
14 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
Hui Zhang
Tian Yuan
Junkun Chen
Xintong Li
Renjie Zheng
...
Zeyu Chen
Xiaoguang Hu
Dianhai Yu
Yanjun Ma
Liang Huang
AuLLM
41
24
0
20 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
30
6
0
17 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Zhepei Wang
Cem Subakan
Xilin Jiang
Junkai Wu
Efthymios Tzinis
Mirco Ravanelli
Paris Smaragdis
CLL
SSL
72
19
0
15 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
34
38
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
50
13
0
11 May 2022
Fatigue Prediction in Outdoor Running Conditions using Audio Data
Andreas Triantafyllopoulos
Sandra Ottl
Alexander Gebhard
Esther Rituerto-González
Mirko Jaumann
...
P. Schneeweiss
I. Krauss
Maurice Gerczuk
Shahin Amiriparian
Björn W. Schuller
40
9
0
09 May 2022
Relation-guided acoustic scene classification aided with event embeddings
Yuanbo Hou
Bo Kang
Wout Van Hauwermeiren
Dick Botteldooren
24
16
0
01 May 2022
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training
Dading Chong
Helin Wang
Peilin Zhou
Qingcheng Zeng
41
65
0
27 Apr 2022