
Exploring wav2vec 2.0 on speaker verification and language identification

11 December 2020
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
arXiv: 2012.06185

Papers citing "Exploring wav2vec 2.0 on speaker verification and language identification"

50 / 102 papers shown
Improving Spoken Language Identification with Map-Mix
Shangeth Rajaa
K. Anandan
Swaraj Dalmia
Tarun Gupta
Chng Eng Siong
8
1
0
16 Feb 2023
KL Regularized Normalization Framework for Low Resource Tasks
Neeraj Kumar
Ankur Narang
Brejesh Lall
26
1
0
21 Dec 2022
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
K. Riedhammer
Tobias Bocklet
34
2
0
16 Nov 2022
Introducing Semantics into Speech Encoders
Derek Xu
Shuyan Dong
Changhan Wang
Suyoun Kim
Zhaojiang Lin
...
Alexei Baevski
Guan-Ting Lin
Hung-yi Lee
Yizhou Sun
Wei Wang
SSL
30
3
0
15 Nov 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
20
3
0
10 Nov 2022
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models
Travis M. Bartley
Fei Jia
Krishna C. Puvvada
Samuel Kriman
Boris Ginsburg
SSL
25
6
0
09 Nov 2022
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
30
109
0
08 Nov 2022
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Zili Huang
Desh Raj
Leibny Paola García-Perera
Sanjeev Khudanpur
80
23
0
01 Nov 2022
Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
4
2
0
28 Oct 2022
Influence of Utterance and Speaker Characteristics on the Classification of Children with Cleft Lip and Palate
Ilja Baumann
Dominik Wagner
Franziska Braun
Sebastian P. Bayerl
Elmar Nöth
K. Riedhammer
Tobias Bocklet
16
2
0
28 Oct 2022
Opening the Black Box of wav2vec Feature Encoder
Kwanghee Choi
E. Yeo
SSL
38
15
0
27 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
C. Zhang
P. Woodland
19
1
0
24 Oct 2022
Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla
Ahnaf Mozib Samin
M. Kobir
Md. Mushtaq Shahriyar Rafee
M. F. Ahmed
Mehedi Hasan
Partha Ghosh
Shafkat Kibria
M. S. Rahman
SSL
18
0
0
24 Oct 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis
Florian Lux
Ching-Yi Chen
Ngoc Thang Vu
23
1
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
25
6
0
20 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
32
8
0
15 Oct 2022
Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora
Yuanchao Li
Yumnah Mohamied
P. Bell
Catherine Lai
SSL
34
45
0
05 Oct 2022
Defend Data Poisoning Attacks on Voice Authentication
Ke Li
Cameron Baird
D. Lin
AAML
38
9
0
09 Sep 2022
Fully Automated End-to-End Fake Audio Detection
Chenglong Wang
Jiangyan Yi
J. Tao
Haiyang Sun
Xun Chen
Zhengkun Tian
Haoxin Ma
Cunhang Fan
Ruibo Fu
26
28
0
20 Aug 2022
Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou
Xiangming Gu
Ye Wang
30
21
0
20 Jul 2022
Multi-level Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition
Zihan Zhao
Yanfeng Wang
Yu Wang
14
33
0
11 Jul 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
16
15
0
10 Jul 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Xianrui Zheng
C. Zhang
P. Woodland
26
16
0
08 Jul 2022
FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
Yeonghyeon Lee
Kangwook Jang
Jahyun Goo
Youngmoon Jung
Hoi-Rim Kim
17
28
0
01 Jul 2022
Low-resource Accent Classification in Geographically-proximate Settings: A Forensic and Sociophonetics Perspective
Qingcheng Zeng
Dading Chong
Peilin Zhou
Jie Yang
25
3
0
26 Jun 2022
Transducer-based language embedding for spoken language identification
Peng Shen
Xugang Lu
Hisashi Kawai
50
6
0
08 Apr 2022
Detecting Vocal Fatigue with Neural Embeddings
Sebastian P. Bayerl
Dominik Wagner
Ilja Baumann
K. Riedhammer
Tobias Bocklet
18
11
0
07 Apr 2022
Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition
Abner Hernandez
Paula Andrea Pérez-Toro
Elmar Nöth
J. Orozco-Arroyave
Andreas Maier
S. Yang
23
38
0
04 Apr 2022
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck
Youngsik Eom
Yeonghyeon Lee
Ji Sub Um
Hoi-Rim Kim
29
25
0
04 Apr 2022
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment
Mu Yang
K. Hirschi
S. Looney
Okim Kang
John H. L. Hansen
35
15
0
29 Mar 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
13
18
0
28 Mar 2022
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
11
6
0
28 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
20
109
0
14 Mar 2022
GCNet: Graph Completion Network for Incomplete Multimodal Learning in Conversation
Zheng Lian
Lang Chen
Guoying Zhao
B. Liu
J. Tao
25
83
0
04 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
34
151
0
24 Feb 2022
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling
Puyuan Peng
David Harwath
SSL
35
26
0
07 Feb 2022
Multi-Variant Consistency based Self-supervised Learning for Robust Automatic Speech Recognition
Changfeng Gao
Gaofeng Cheng
Pengyuan Zhang
25
4
0
23 Dec 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
20
146
0
04 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
37
150
0
27 Oct 2021
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks
Sangeeta Srivastava
Yun Wang
Andros Tjandra
Anurag Kumar
Chunxi Liu
Kritika Singh
Yatharth Saraf
SSL
33
24
0
14 Oct 2021
Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Li-Wei Chen
Alexander I. Rudnicky
VLM
19
121
0
12 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Hung-yi Lee
Shinji Watanabe
T. Toda
25
40
0
12 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
26
124
0
12 Oct 2021
Injecting Text and Cross-lingual Supervision in Few-shot Learning from Self-Supervised Models
Matthew Wiesner
Desh Raj
Sanjeev Khudanpur
56
6
0
10 Oct 2021
Multi-task Voice Activated Framework using Self-supervised Learning
Shehzeen Samarah Hussain
V. Nguyen
Shuhua Zhang
Erik M. Visser
SSL
19
12
0
03 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
36
107
0
30 Sep 2021
Direct speech-to-speech translation with discrete units
Ann Lee
Peng-Jen Chen
Changhan Wang
Jiatao Gu
Sravya Popuri
...
Yossi Adi
Qing He
Yun Tang
J. Pino
Wei-Ning Hsu
33
180
0
12 Jul 2021
Improved Language Identification Through Cross-Lingual Self-Supervised Learning
Andros Tjandra
Diptanu Gon Choudhury
Frank Zhang
Kritika Singh
Alexis Conneau
Alexei Baevski
Assaf Sela
Yatharth Saraf
Michael Auli
VLM
SSL
24
35
0
08 Jul 2021
Pretext Tasks selection for multitask self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
Abdel Heba
SSL
14
12
0
01 Jul 2021
Unsupervised Speech Recognition
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
20
270
0
24 May 2021