2109.15053
Fine-tuning wav2vec2 for speaker recognition
30 September 2021
Nik Vaessen
David A. van Leeuwen
Papers citing "Fine-tuning wav2vec2 for speaker recognition"
49 / 49 papers shown
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues
Rui Ribeiro
Luísa Coheur
Joao Paulo Carvalho
31
0
0
21 Apr 2025
Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
Aneesha Sampath
James Tavernor
E. Provost
48
0
0
17 Feb 2025
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
34
0
0
24 Oct 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Sung Won Han
SLR
50
0
0
12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks
Nakamasa Inoue
Shinta Otake
Takumi Hirose
Masanari Ohi
Rei Kawakami
36
1
0
28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
Yi Zhu
Surya Koppisetti
Trang Tran
Gaurav Bharaj
52
9
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Mengzhe Geng
Zengrui Jin
Jiajun Deng
...
Yi Wang
Mingyu Cui
Tianzi Wang
Helen Meng
Xunying Liu
43
5
0
03 Jul 2024
Target Speech Extraction with Pre-trained Self-supervised Learning Models
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Shoko Araki
J. Černocký
39
8
0
17 Feb 2024
Probing Self-supervised Learning Models with Target Speech Extraction
Junyi Peng
Marc Delcroix
Tsubasa Ochiai
Oldrich Plchot
Takanori Ashihara
Shoko Araki
J. Černocký
40
2
0
17 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Danwei Cai
Zexin Cai
Ming Li
25
0
0
03 Jan 2024
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
Huimeng Wang
Zengrui Jin
Mengzhe Geng
Shujie Hu
Guinan Li
Tianzi Wang
Haoning Xu
Xunying Liu
19
10
0
01 Jan 2024
Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference
Dejan Porjazovski
Yaroslav Getman
Tamás Grósz
M. Kurimo
30
3
0
16 Oct 2023
Wav2vec-based Detection and Severity Level Classification of Dysarthria from Speech
Farhad Javanmardi
Saska Tirronen
Manila Kodali
Sudarsana Reddy Kadiri
P. Alku
8
28
0
25 Sep 2023
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Titouan Parcollet
H. Nguyen
Solène Evain
Marcely Zanon Boito
Adrien Pupier
...
François Portet
Solange Rossato
F. Ringeval
D. Schwab
Laurent Besacier
40
15
0
11 Sep 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0
Oubaïda Chouchane
Michele Panariello
Chiara Galdi
Massimiliano Todisco
Nicholas W. D. Evans
27
2
0
27 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification
Harunori Kawano
Sota Shimizu
30
1
0
22 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
19
1
0
09 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals
Sudarsana Reddy Kadiri
Farhad Javanmardi
P. Alku
24
6
0
06 Aug 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Yuya Yamamoto
25
2
0
22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations
Nayan Anand
Meenakshi Sirigiraju
Chiranjeevi Yarra
31
1
0
15 Jun 2023
Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models
Danilo de Oliveira
N. Prabhu
Timo Gerkmann
23
5
0
30 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
21
2
0
24 May 2023
Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices
Ahlam Husni Abu Nada
S. Latif
Junaid Qadir
20
4
0
22 Apr 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
32
1
0
19 Mar 2023
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
Xiaobao Guo
Nithish Muthuchamy Selvaraj
Zitong Yu
A. Kong
Bingquan Shen
Alex C. Kot
38
8
0
09 Mar 2023
Exploring Self-supervised Pre-trained ASR Models For Dysarthric and Elderly Speech Recognition
Shujie Hu
Xurong Xie
Zengrui Jin
Mengzhe Geng
Yi Wang
Mingyu Cui
Jiajun Deng
Xunying Liu
Helen M. Meng
19
30
0
28 Feb 2023
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
16
0
0
24 Feb 2023
Speaker and Language Change Detection using Wav2vec2 and Whisper
Tijn Berns
Nik Vaessen
David A. van Leeuwen
48
4
0
18 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
34
5
0
06 Feb 2023
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
24
0
0
14 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
18
5
0
04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
19
0
0
03 Nov 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
30
0
0
29 Oct 2022
Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
4
2
0
28 Oct 2022
Fast Yet Effective Speech Emotion Recognition with Self-distillation
Zhao Ren
Thanh Tam Nguyen
Yi Chang
Björn W. Schuller
15
11
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
C. Zhang
P. Woodland
21
1
0
24 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
25
6
0
20 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
32
8
0
15 Oct 2022
Fine-tuning Wav2vec for Vocal-burst Emotion Recognition
Dang-Khanh Nguyen
Sudarshan Pant
Ngoc-Huynh Ho
Gueesang Lee
Soo-Huyng Kim
Hyung-Jeong Yang
22
3
0
01 Oct 2022
Speech Emotion: Investigating Model Representations, Multi-Task Learning and Knowledge Distillation
Vikramjit Mitra
H. Chien
Vasudha Kowtha
Joseph Y. Cheng
Erdrin Azemi
17
7
0
02 Jul 2022
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Feng Wang
Jiashui Wang
AAML
20
36
0
07 Jun 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
13
18
0
28 Mar 2022
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
11
6
0
28 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
151
0
24 Feb 2022
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
20
146
0
04 Nov 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
117
202
0
11 Dec 2020
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
230
2,233
0
14 Jun 2018