2109.15053
Cited By
Fine-tuning wav2vec2 for speaker recognition
30 September 2021
Nik Vaessen
David A. van Leeuwen
Papers citing "Fine-tuning wav2vec2 for speaker recognition"
49 / 49 papers shown
Title
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues
Rui Ribeiro
Luísa Coheur
Joao Paulo Carvalho
31
0
0
21 Apr 2025
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
Aneesha Sampath
James Tavernor
E. Provost
51
0
0
17 Feb 2025
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
34
0
0
24 Oct 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Sung Won Han
SLR
50
0
0
12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks
Nakamasa Inoue
Shinta Otake
Takumi Hirose
Masanari Ohi
Rei Kawakami
39
1
0
28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
52
9
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Mengzhe Geng
Zengrui Jin
Jiajun Deng
...
Yi Wang
Mingyu Cui
Tianzi Wang
Helen Meng
Xunying Liu
51
5
0
03 Jul 2024
Target Speech Extraction with Pre-trained Self-supervised Learning Models
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
42
8
0
17 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Takanori Ashihara
Shoko Araki
J. Černocký
40
2
0
17 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Danwei Cai
Zexin Cai
Ming Li
25
0
0
03 Jan 2024
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
Huimeng Wang
Zengrui Jin
Mengzhe Geng
Shujie Hu
Guinan Li
Tianzi Wang
Haoning Xu
Xunying Liu
19
10
0
01 Jan 2024
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
30
3
0
16 Oct 2023
Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech
Farhad Javanmardi
Saska Tirronen
Manila Kodali
Sudarsana Reddy Kadiri
P. Alku
13
28
0
25 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
40
15
0
11 Sep 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0
Oubaïda Chouchane
Michele Panariello
Chiara Galdi
Massimiliano Todisco
Nicholas W. D. Evans
27
2
0
27 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification
Harunori Kawano
Sota Shimizu
30
1
0
22 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
19
1
0
09 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Sudarsana Reddy Kadiri
Farhad Javanmardi
P. Alku
30
6
0
06 Aug 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Yuya Yamamoto
30
2
0
22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations
Nayan Anand
Meenakshi Sirigiraju
Chiranjeevi Yarra
31
1
0
15 Jun 2023
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models
Danilo de Oliveira
N. Prabhu
Timo Gerkmann
25
5
0
30 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
27
2
0
24 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Ahlam Husni Abu Nada
S. Latif
Junaid Qadir
25
4
0
22 Apr 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
32
1
0
19 Mar 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
Xiaobao Guo
Nithish Muthuchamy Selvaraj
Zitong Yu
A. Kong
Bingquan Shen
Alex C. Kot
38
8
0
09 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
19
30
0
28 Feb 2023
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
22
0
0
24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
53
4
0
18 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
34
5
0
06 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
27
0
0
14 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
20
5
0
04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
19
0
0
03 Nov 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
33
0
0
29 Oct 2022
Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
12
2
0
28 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation
Zhao Ren
Thanh Tam Nguyen
Yi Chang
Björn W. Schuller
20
11
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
C. Zhang
P. Woodland
27
1
0
24 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
31
6
0
20 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
37
8
0
15 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
22
3
0
01 Oct 2022
Speech Emotion: Investigating Model Representations, Multi-Task Learning and Knowledge Distillation
Vikramjit Mitra
H. Chien
Vasudha Kowtha
Joseph Y. Cheng
Erdrin Azemi
22
7
0
02 Jul 2022
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Feng Wang
Jiashui Wang
AAML
20
36
0
07 Jun 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
13
18
0
28 Mar 2022
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
16
6
0
28 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
151
0
24 Feb 2022
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
26
146
0
04 Nov 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
117
202
0
11 Dec 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
251
2,233
0
14 Jun 2018