PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

21 December 2019

Yuxuan Wang

Papers citing "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition"

50 / 216 papers shown

An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification Zhi-Wei Zhong M. Hirano Kazuki Shimada Kazuya Tateishi Shusuke Takahashi Yuki Mitsufuji 26 12 0 16 Feb 2023
Personalized Audio Quality Preference Prediction Chung-Che Wang Yu-Chun Lin Yu-Teng Hsu J. Jang 27 1 0 16 Feb 2023
Unsupervised classification to improve the quality of a bird song recording dataset Félix Michaud J. Sueur Maxime Le Cesne S. Haupert 29 28 0 15 Feb 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection Xizi Wang Feng Cheng Gedas Bertasius David J. Crandall 26 15 0 19 Jan 2023
Training one model to detect heart and lung sound events from single point auscultations Leander Melms Robert R. Ilesan Ulrich Köhler O. Hildebrandt R. Conradt ... Jürgen R. Schaefer Tobias Müller J. Obergassel Nadine Schlicker M. Hirsch 31 2 0 15 Jan 2023
Improving trajectory localization accuracy via direction-of-arrival derivative estimation Ruchi Pandey Shreya Jaiswal Huy P Phan S. Nannuru 30 0 0 07 Dec 2022
Interpretability Analysis of Deep Models for COVID-19 Detection Daniel Peixoto Pinto da Silva Edresson Casanova L. Gris A. Júnior Marcelo Finger ... Beatriz Raposo Marcus Martins S. Aluísio L. Berti João Paulo Teixeira 23 3 0 25 Nov 2022
SpectNet: End-to-End Audio Signal Classification Using Learnable Spectrograms Md. Istiaq Ansari Taufiq Hasan 17 4 0 17 Nov 2022
Music Instrument Classification Reprogrammed Hsin-Hung Chen Alexander Lerch 24 4 0 15 Nov 2022
Describing emotions with acoustic property prompts for speech emotion recognition Hira Dhamyal Benjamin Elizalde Soham Deshmukh Huaming Wang Bhiksha Raj Rita Singh 26 10 0 14 Nov 2022
The Birds Need Attention Too: Analysing usage of Self Attention in identifying bird calls in soundscapes Chandra Kanth Nagesh Abhishek Purushothama 29 2 0 14 Nov 2022
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation Yusong Wu Kai Chen Tianyu Zhang Yuchen Hui Marianna Nezhurina Taylor Berg-Kirkpatrick Shlomo Dubnov CLIP 43 493 0 12 Nov 2022
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation Florian Schmid Khaled Koutini Gerhard Widmer ViT 28 58 0 09 Nov 2022
Introducing topography in convolutional neural networks Maxime Poli Emmanuel Dupoux Rachid Riad 39 0 0 28 Oct 2022
Multi-dimensional Edge-based Audio Event Relational Graph Representation Learning for Acoustic Scene Classification Yuanbo Hou Siyang Song Chuan Yu Yuxin Song Wenwu Wang Dick Botteldooren 42 3 0 27 Oct 2022
Play It Back: Iterative Attention for Audio Recognition Alexandros Stergiou Dima Damen 42 4 0 20 Oct 2022
Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing Georgios Rizos J. Lawson Simon Mitchell Pranay Shah Xin Wen Cristina Banks‐Leite R. Ewers Bjoern W. Schuller UQCV 23 2 0 19 Oct 2022
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context L. D. Pham Dusan Salovic Anahid N. Jalali Alexander Schindler Khoa Tran H. Vu Phu X. Nguyen 35 5 0 16 Oct 2022
Cross-dataset COVID-19 Transfer Learning with Cough Detection, Cough Segmentation, and Data Augmentation Bagus Tris Atmaja Zanjabila Suyanto A. Sasou 34 1 0 12 Oct 2022
Supervised and Unsupervised Learning of Audio Representations for Music Understanding Matthew C. McCallum Filip Korzeniowski Sergio Oramas F. Gouyon Andreas F. Ehmann SSL 80 37 0 07 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification Haohe Liu Xubo Liu Qiuqiang Kong Wenwu Wang Mark D. Plumbley 39 7 0 04 Oct 2022
Contrastive Audio-Visual Masked Autoencoder Yuan Gong Andrew Rouditchenko Alexander H. Liu David Harwath Leonid Karlinsky Hilde Kuehne James R. Glass 45 120 0 02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations Heinrich Dinkel Zhiyong Yan Yongqing Wang Junbo Zhang Yujun Wang 43 1 0 30 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks Andrés Vasco-Carofilis Laura Fernández-Robles Enrique Alegre Eduardo Fidalgo 47 2 0 28 Sep 2022
The Efficacy of Self-Supervised Speech Models for Audio Representations Tung-Yu Wu Chen-An Li Tzu-Han Lin Tsung-Yuan Hsu Hung-yi Lee 37 5 0 26 Sep 2022
UniKW-AT: Unified Keyword Spotting and Audio Tagging Heinrich Dinkel Yongqing Wang Zhiyong Yan Junbo Zhang Yujun Wang 47 3 0 23 Sep 2022
Language-based Audio Retrieval Task in DCASE 2022 Challenge Huang Xie Samuel Lipping Tuomas Virtanen 79 18 0 20 Sep 2022
Improving Natural-Language-based Audio Retrieval with Transfer Learning and Audio & Text Augmentations Paul Primus Gerhard Widmer 29 6 0 24 Aug 2022
Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers Paul Primus Gerhard Widmer VLM 27 5 0 24 Aug 2022
Fall Detection from Audios with Audio Transformers Prabhjot Kaur Qifan Wang Weisong Shi 29 16 0 23 Aug 2022
Pathway to Future Symbiotic Creativity Yi-Ting Guo Qi-fei Liu Jie Chen Wei Xue Jie Fu ... Fernando Rosas Jeffrey Shaw Xing Wu Jiji Zhang Jianliang Xu 39 0 0 18 Aug 2022
An investigation on selecting audio pre-trained models for audio captioning Peiran Yan Sheng-Wei Li 26 0 0 12 Aug 2022
Seeing your sleep stage: cross-modal distillation from EEG to infrared video Jianan Han Shenmin Zhang Aidong Men Yang Liu Z. Yao Yan-Tao Yan Qingchao Chen 33 4 0 11 Aug 2022
Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis Jia Li Ziyang Zhang Jun Lang Yueqi Jiang Liuwei An ... Sheng Gao Jie Lin Chunxiao Fan Xiao Sun Meng Wang 59 30 0 05 Aug 2022
GAFX: A General Audio Feature eXtractor Zhaoyang Bu Han Zhang Xiaohu Zhu 30 0 0 19 Jul 2022
Segment-level Metric Learning for Few-shot Bioacoustic Event Detection Haohe Liu Xubo Liu Xinhao Mei Qiuqiang Kong Wenwu Wang Mark D. Plumbley 33 8 0 15 Jul 2022
Masked Autoencoders that Listen Po-Yao (Bernie) Huang Hu Xu Juncheng Billy Li Alexei Baevski Michael Auli Wojciech Galuba Florian Metze Christoph Feichtenhofer 28 270 0 13 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use Jan Schlüter Gerald Gutenbrunner VLM 39 12 0 12 Jul 2022
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss Andrew Koh Chng Eng Siong 32 1 0 29 Jun 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer Jinmiao Huang W. Gharbieh Qianhui Wan Han Suk Shim Chul Lee 22 9 0 23 Jun 2022
Redundancy Reduction Twins Network: A Training framework for Multi-output Emotion Regression Xin Jing Meishu Song Andreas Triantafyllopoulos Zijiang Yang Björn W. Schuller 21 8 0 18 Jun 2022
Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction Andreas Triantafyllopoulos Meishu Song Zijiang Yang Xin Jing Björn W. Schuller 27 8 0 14 Jun 2022
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit Hui Zhang Tian Yuan Junkun Chen Xintong Li Renjie Zheng ... Zeyu Chen Xiaoguang Hu Dianhai Yu Yanjun Ma Liang Huang AuLLM 41 24 0 20 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model Daisuke Niizumi Daiki Takeuchi Yasunori Ohishi Noboru Harada K. Kashino 30 6 0 17 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning Zhepei Wang Cem Subakan Xilin Jiang Junkai Wu Efthymios Tzinis Mirco Ravanelli Paris Smaragdis CLL SSL 72 19 0 15 May 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges Xinhao Mei Xubo Liu Mark D. Plumbley Wenwu Wang 34 38 0 12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning Xuenan Xu Zeyu Xie Mengyue Wu K. Yu 50 13 0 11 May 2022
Fatigue Prediction in Outdoor Running Conditions using Audio Data Andreas Triantafyllopoulos Sandra Ottl Alexander Gebhard Esther Rituerto-González Mirko Jaumann ... P. Schneeweiss I. Krauss Maurice Gerczuk Shahin Amiriparian Björn W. Schuller 40 9 0 09 May 2022
Relation-guided acoustic scene classification aided with event embeddings Yuanbo Hou Bo Kang Wout Van Hauwermeiren Dick Botteldooren 24 16 0 01 May 2022
Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training Dading Chong Helin Wang Peilin Zhou Qingcheng Zeng 41 65 0 27 Apr 2022