Exploring wav2vec 2.0 on speaker verification and language identification

11 December 2020

Bo Xu

Papers citing "Exploring wav2vec 2.0 on speaker verification and language identification"

50 / 102 papers shown

Title
Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning Abdulhady Abas Abdullah S. H. Karim Sara Azad Ahmed Kanar R. Tariq Tarik Ahmed Rashid 153 0 0 23 Apr 2025
Respiratory Inhaler Sound Event Classification Using Self-Supervised Learning Davoud Shariat Panah Alessandro N Franciosi Cormac McCarthy Andrew Hines 21 0 0 15 Apr 2025
Exploring Modality Disruption in Multimodal Fake News Detection Moyang Liu Kaiying Yan Yukun Liu Ruibo Fu Zhengqi Wen Xuefei Liu Chenxing Li 24 0 0 12 Apr 2025
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech Ji-Hoon Kim Jeongsoo Choi Jaehun Kim Chaeyoung Jung Joon Son Chung CVBM 50 1 0 21 Mar 2025
A Dual-Stage Time-Context Network for Speech-Based Alzheimer's Disease Detection Yifan Gao Long Guo Hong Liu 93 0 0 18 Feb 2025
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation Ji-Hoon Kim Hong-Sun Yang Yoon-Cheol Ju Il-Hwan Kim Byeong-Yeol Kim Joon Son Chung BDL 51 0 0 31 Dec 2024
LLM-Ref: Enhancing Reference Handling in Technical Writing with Large Language Models Kazi Ahmed Asif Fuad Lizhong Chen 26 0 0 01 Nov 2024
Do Discrete Self-Supervised Representations of Speech Capture Tone Distinctions? Opeyemi Osakuade Simon King 34 0 0 25 Oct 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification Jin Sob Kim Hyun Joon Park Wooseok Shin Sung Won Han SLR 50 0 0 12 Sep 2024
ELP-Adapters: Parameter Efficient Adapter Tuning for Various Speech Processing Tasks Nakamasa Inoue Shinta Otake Takumi Hirose Masanari Ohi Rei Kawakami 34 1 0 28 Jul 2024
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection Yi Zhu Surya Koppisetti Trang Tran Gaurav Bharaj 52 9 0 26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning Shuai Wang Zheng-Shou Chen Kong Aik Lee Yan-min Qian Haizhou Li 39 4 0 21 Jul 2024
Universal Sound Separation with Self-Supervised Audio Masked Autoencoder Junqi Zhao Xubo Liu Jinzheng Zhao Yiitan Yuan Qiuqiang Kong Mark D. Plumbley Wenwu Wang 25 3 0 16 Jul 2024
A Layer-Anchoring Strategy for Enhancing Cross-Lingual Speech Emotion Recognition Shreya G. Upadhyay Carlos Busso Chi-Chun Lee 40 3 0 06 Jul 2024
Speech Representation Analysis based on Inter- and Intra-Model Similarities Yassine El Kheir Ahmed M. Ali Shammur A. Chowdhury SSL 43 2 0 23 Jun 2024
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics Cheol Jun Cho Peter Wu Tejas S. Prabhune Dhruv Agarwal Gopala K. Anumanchipalli 36 1 0 18 Jun 2024
Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection Zihan Pan Tianchi Liu Hardik B. Sailor Qiongqiong Wang 45 10 0 12 Jun 2024
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models Victor Miara Theo Lepage Reda Dehak 29 1 0 04 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models Shu-Wen Yang Heng-Jui Chang Zili Huang Andy T. Liu Cheng-I Jeff Lai ... Kushal Lakhotia Shang-Wen Li Abdelrahman Mohamed Shinji Watanabe Hung-yi Lee 38 19 0 15 Apr 2024
SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning Luca Zampierin G. B. Hacene Bac Nguyen Mirco Ravanelli 38 2 0 26 Feb 2024
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? Zakaria Aldeneh Takuya Higuchi Jee-weon Jung Skyler Seto Tatiana Likhomanenko Stephen Shum Ahmed Hussen Abdelaziz Shinji Watanabe B. Theobald SSL 34 2 0 01 Feb 2024
Singer Identity Representation Learning using Self-Supervised Techniques Bernardo Torres Stefan Lattner Gaël Richard SSL 35 8 0 10 Jan 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning Danwei Cai Zexin Cai Ming Li 25 0 0 03 Jan 2024
Generative linguistic representation for spoken language identification Peng Shen Xuguang Lu Hisashi Kawai 14 0 0 18 Dec 2023
On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks Daniel Claborne Eric Slyman Karl Pazdernik 12 0 0 09 Nov 2023
Automatic Pronunciation Assessment -- A Review Yassine El Kheir Ahmed M. Ali Shammur A. Chowdhury 24 6 0 21 Oct 2023
Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables Ahmed Attia Yashish M. Siriwardena Carol Y. Espy-Wilson SSL 37 4 0 17 Sep 2023
Let There Be Sound: Reconstructing High Quality Speech from Silent Videos Ji-Hoon Kim Jaehun Kim Joon Son Chung 27 5 0 29 Aug 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0 Oubaïda Chouchane Michele Panariello Chiara Galdi Massimiliano Todisco Nicholas W. D. Evans 27 2 0 27 Aug 2023
Implicit Self-supervised Language Representation for Spoken Language Diarization Jagabandhu Mishra S. R. Mahadeva Prasanna 14 0 0 21 Aug 2023
Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance Sourya Dipta Das Yash Vadi Abhishek Unnam Kuldeep Yadav 20 1 0 09 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation Zirui Ge Xinzhou Xu Haiyan Guo Tingting Wang Zhen Yang SSL 19 1 0 09 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor Peter Vieting Ralf Schlüter Hermann Ney 20 2 0 08 Aug 2023
Investigation of Self-supervised Pre-trained Models for Classification of Voice Quality from Speech and Neck Surface Accelerometer Signals Sudarsana Reddy Kadiri Farhad Javanmardi P. Alku 22 6 0 06 Aug 2023
Towards spoken dialect identification of Irish Liam Lonergan Mengjie Qian Neasa Ní Chiaráin Christer Gobl A. N. Chasaide 16 4 0 14 Jul 2023
Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure Yikang Wang Hiromitsu Nishizaki Ming Li 39 0 0 04 Jul 2023
What Do Self-Supervised Speech Models Know About Words? Ankita Pasad C. Chien Shane Settle Karen Livescu SSL 33 26 0 30 Jun 2023
Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition Samuel Cahyawijaya Holy Lovenia Willy Chung Rita Frieske Zihan Liu Pascale Fung 37 1 0 26 Jun 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies Yuya Yamamoto 25 2 0 22 Jun 2023
Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations Nayan Anand Meenakshi Sirigiraju Chiranjeevi Yarra 31 1 0 15 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model Mu Yang R. Shekar Okim Kang John H. L. Hansen 15 10 0 10 Jun 2023
Label Aware Speech Representation Learning For Language Identification Shikhar Vashishth Shikhar Bharadwaj Sriram Ganapathy Ankur Bapna Min Ma Wei Han Vera Axelrod Partha P. Talukdar SSL 23 4 0 07 Jun 2023
Investigating model performance in language identification: beyond simple error statistics S. Styles Victoria Y. H. Chua Fei Ting Woon Hexin Liu Leibny Paola García Perera Sanjeev Khudanpur Andy W. H. Khong Justin Dauwels 13 2 0 30 May 2023
From 'Snippet-lects' to Doculects and Dialects: Leveraging Neural Representations of Speech for Placing Audio Signals in a Language Landscape Severine Guillaume Guillaume Wisniewski Alexis Michaud 18 2 0 29 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model Aoi Ito Shota Horiguchi SSL 19 2 0 24 May 2023
Scaling Speech Technology to 1,000+ Languages Vineel Pratap Andros Tjandra Bowen Shi Paden Tomasello Arun Babu ... Yossi Adi Xiaohui Zhang Wei-Ning Hsu Alexis Conneau Michael Auli VLM 77 300 0 22 May 2023
Self-Supervised Learning with Cluster-Aware-DINO for High-Performance Robust Speaker Verification Bing Han Zhengyang Chen Y. Qian 11 18 0 12 Apr 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework Zirui Ge Haiyan Guo Zhen Yang 29 1 0 19 Mar 2023
Towards multi-task learning of speech and speaker recognition Nik Vaessen David A. van Leeuwen CVBM 14 0 0 24 Feb 2023
Knowledge-aware Bayesian Co-attention for Multimodal Emotion Recognition Zihan Zhao Yu Wang Yanfeng Wang 20 18 0 20 Feb 2023