Learning Filterbanks from Raw Speech for Phone Recognition

3 November 2017

Papers citing "Learning Filterbanks from Raw Speech for Phone Recognition"

ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration Daniel Haider Felix Perfler Péter Balázs Clara Hollomey Nicki Holighaus 46 0 0 12 May 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling Jakob Poncelet Hugo Van hamme 83 0 0 05 Feb 2025
Audio Anti-Spoofing Detection: A Survey Menglu Li Yasaman Ahmadiadli Xiao-Ping Zhang 50 19 0 22 Apr 2024
Learnable Front Ends Based on Temporal Modulation for Music Tagging Yi Ma R. Stern 30 0 0 28 Nov 2022
Learning Temporal Resolution in Spectrogram for Audio Classification Haohe Liu Xubo Liu Qiuqiang Kong Wenwu Wang Mark D. Plumbley 34 7 0 04 Oct 2022
Low-Level Physiological Implications of End-to-End Learning of Speech Recognition Louise Coppieters de Gibson Philip N. Garner 26 1 0 22 Aug 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use Jan Schluter Gerald Gutenbrunner VLM 39 12 0 12 Jul 2022
Learnable Nonlinear Compression for Robust Speaker Verification Xuechen Liu Md. Sahidullah Tomi Kinnunen 30 2 0 10 Feb 2022
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions Ge Zhu Frank Cwitkowitz Z. Duan 22 2 0 08 Oct 2021
Learning Sparse Analytic Filters for Piano Transcription Frank Cwitkowitz M. Heydari Z. Duan 27 2 0 23 Aug 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild Okan Kopuklu Maja Taseska Gerhard Rigoll 3DV 29 45 0 07 Jun 2021
LEAF: A Learnable Frontend for Audio Classification Neil Zeghidour O. Teboul Félix de Chaumont Quitry Marco Tagliasacchi VLM AAML 85 144 0 21 Jan 2021
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions Ludwig Kurzinger Nicolas Lindae Palle Klewitz Gerhard Rigoll 27 5 0 15 Oct 2020
Optimization of data-driven filterbank for automatic speaker verification S. K. Sarangi Md. Sahidullah G. Saha 26 38 0 21 Jul 2020
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech Andy T. Liu Shang-Wen Li Hung-yi Lee SSL 62 356 0 12 Jul 2020
CGCNN: Complex Gabor Convolutional Neural Network on raw speech Paul-Gauthier Noé Titouan Parcollet Mohamed Morchid 22 29 0 11 Feb 2020
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends S. Latif R. Rana Sara Khalifa Raja Jurdak Junaid Qadir Björn W. Schuller AI4TS 34 81 0 02 Jan 2020
Effectiveness of self-supervised pre-training for speech recognition Alexei Baevski Michael Auli Abdel-rahman Mohamed SSL 27 147 0 10 Nov 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations Alexei Baevski Steffen Schneider Michael Auli SSL 22 660 0 12 Oct 2019
Universal Adversarial Audio Perturbations Sajjad Abdoli L. G. Hafemann Jérôme Rony Ismail Ben Ayed P. Cardinal Alessandro Lameiras Koerich AAML 25 51 0 08 Aug 2019
Learning to detect dysarthria from raw speech Juliette Millet Neil Zeghidour 32 41 0 27 Nov 2018
Deep Audio-Visual Speech Recognition Triantafyllos Afouras Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 27 687 0 06 Sep 2018
End-to-End Speech Recognition From the Raw Waveform Neil Zeghidour Nicolas Usunier Gabriel Synnaeve R. Collobert Emmanuel Dupoux 27 84 0 19 Jun 2018