Fine-tuning wav2vec2 for speaker recognition

30 September 2021

Papers citing "Fine-tuning wav2vec2 for speaker recognition"

Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues Rui Ribeiro Luísa Coheur Joao Paulo Carvalho 31 0 0 21 Apr 2025
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers Aneesha Sampath James Tavernor E. Provost 48 0 0 17 Feb 2025
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques David Ortiz-Perez Manuel Benavent-Lledo José García Rodríguez David Tomás M. Flores Vizcaya-Moreno 34 0 0 24 Oct 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification Jin Sob Kim Hyun Joon Park Wooseok Shin Sung Won Han SLR 50 0 0 12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks Nakamasa Inoue Shinta Otake Takumi Hirose Masanari Ohi Rei Kawakami 36 1 0 28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection Yi Zhu Surya Koppisetti Trang Tran Gaurav Bharaj 52 9 0 26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning Shuai Wang Zheng-Shou Chen Kong Aik Lee Yan-min Qian Haizhou Li 39 4 0 21 Jul 2024
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition Shujie Hu Xurong Xie Mengzhe Geng Zengrui Jin Jiajun Deng ... Yi Wang Mingyu Cui Tianzi Wang Helen Meng Xunying Liu 43 5 0 03 Jul 2024
Target Speech Extraction with Pre-trained Self-supervised Learning Models Junyi Peng Marc Delcroix Tsubasa Ochiai Oldrich Plchot Shoko Araki J. Černocký 39 8 0 17 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction Junyi Peng Marc Delcroix Tsubasa Ochiai Oldrich Plchot Takanori Ashihara Shoko Araki J. Černocký 40 2 0 17 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning Danwei Cai Zexin Cai Ming Li 25 0 0 03 Jan 2024
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation Huimeng Wang Zengrui Jin Mengzhe Geng Shujie Hu Guinan Li Tianzi Wang Haoning Xu Xunying Liu 19 10 0 01 Jan 2024
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference Dejan Porjazovski Yaroslav Getman Tamás Grósz M. Kurimo 30 3 0 16 Oct 2023
Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech Farhad Javanmardi Saska Tirronen Manila Kodali Sudarsana Reddy Kadiri P. Alku 8 28 0 25 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech Titouan Parcollet H. Nguyen Solène Evain Marcely Zanon Boito Adrien Pupier ... François Portet Solange Rossato F. Ringeval D. Schwab Laurent Besacier 40 15 0 11 Sep 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0 Oubaïda Chouchane Michele Panariello Chiara Galdi Massimiliano Todisco Nicholas W. D. Evans 27 2 0 27 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification Harunori Kawano Sota Shimizu 30 1 0 22 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation Zirui Ge Xinzhou Xu Haiyan Guo Tingting Wang Zhen Yang SSL 19 1 0 09 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals Sudarsana Reddy Kadiri Farhad Javanmardi P. Alku 24 6 0 06 Aug 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies Yuya Yamamoto 25 2 0 22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations Nayan Anand Meenakshi Sirigiraju Chiranjeevi Yarra 31 1 0 15 Jun 2023
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models Danilo de Oliveira N. Prabhu Timo Gerkmann 23 5 0 30 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model Aoi Ito Shota Horiguchi SSL 21 2 0 24 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices Ahlam Husni Abu Nada S. Latif Junaid Qadir 20 4 0 22 Apr 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework Zirui Ge Haiyan Guo Zhen Yang 32 1 0 19 Mar 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning Xiaobao Guo Nithish Muthuchamy Selvaraj Zitong Yu A. Kong Bingquan Shen Alex C. Kot 38 8 0 09 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition Shujie Hu Xurong Xie Zengrui Jin Mengzhe Geng Yi Wang Mingyu Cui Jiajun Deng Xunying Liu Helen M. Meng 19 30 0 28 Feb 2023
Towards multi-task learning of speech and speaker recognition Nik Vaessen David A. van Leeuwen CVBM 16 0 0 24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper Tijn Berns Nik Vaessen David A. van Leeuwen 48 4 0 18 Feb 2023
Residual Information in Deep Speaker Embedding Architectures Adriana Stan 34 5 0 06 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks Shinta Otake Rei Kawakami Nakamasa Inoue 24 16 0 06 Dec 2022
Multi-Label Training for Text-Independent Speaker Identification Yuqi Xue 24 0 0 14 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models Ju-ho Kim Ju-Sung Heo Hyun-Seo Shin Chanmann Lim Ha-Jin Yu 18 5 0 04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification A. Ollerenshaw Md. Asif Jalal Thomas Hain 19 0 0 03 Nov 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning Mine Kerpicci V. Nguyen Shuhua Zhang Erik M. Visser 30 0 0 29 Oct 2022
Universal speaker recognition encoders for different speech segments duration Sergey Novoselov V. Volokhov G. Lavrentyeva 4 2 0 28 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation Zhao Ren Thanh Tam Nguyen Yi Chang Björn W. Schuller 15 11 0 26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation Evonne Lee Guangzhi Sun C. Zhang P. Woodland 21 1 0 24 Oct 2022
Large-scale learning of generalised representations for speaker recognition Jee-weon Jung Hee-Soo Heo Bong-Jin Lee Jaesong Lee Hye-jin Shim Youngki Kwon Joon Son Chung Shinji Watanabe CVBM 25 6 0 20 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations Themos Stafylakis Ladislav Mošner Sofoklis Kakouros Oldrich Plchot L. Burget J. Černocký SSL 32 8 0 15 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition Dang-Khanh Nguyen Sudarshan Pant Ngoc-Huynh Ho Gueesang Lee Soo-Huyng Kim Hyung-Jeong Yang 22 3 0 01 Oct 2022
Speech Emotion: Investigating Model Representations, Multi-Task Learning and Knowledge Distillation Vikramjit Mitra H. Chien Vasudha Kowtha Joseph Y. Cheng Erdrin Azemi 17 7 0 02 Jul 2022
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition Guangke Chen Zhe Zhao Fu Song Sen Chen Lingling Fan Feng Wang Jiashui Wang AAML 20 36 0 07 Jun 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0 Sergey Novoselov G. Lavrentyeva Anastasia Avdeeva V. Volokhov Aleksei Gusev ViT 13 18 0 28 Mar 2022
Training speaker recognition systems with limited data Nik Vaessen David A. van Leeuwen 11 6 0 28 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation Hemlata Tak Massimiliano Todisco Xin Wang Jee-weon Jung Junichi Yamagishi Nicholas W. D. Evans 34 151 0 24 Feb 2022
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding Yingzhi Wang Abdelmoumene Boumadane A. Heba 20 146 0 04 Nov 2021
Exploring wav2vec 2.0 on speaker verification and language identification Zhiyun Fan Meng Li Shiyu Zhou Bo Xu 117 202 0 11 Dec 2020
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 230 2,233 0 14 Jun 2018