Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

21 July 2024
Shuai Wang
Zhengyang Chen
Kong Aik Lee
Yanmin Qian
Haizhou Li

Papers citing "Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning"

50 / 108 papers shown
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
685
6,079
0
29 Apr 2021
AdaSpeech: Adaptive Text to Speech for Custom Voice
Mingjian Chen
Xu Tan
Bohan Li
Yanqing Liu
Tao Qin
Sheng Zhao
Tie-Yan Liu
VLM
DiffM
84
192
0
01 Mar 2021
WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit
Zhuoyuan Yao
Di Wu
Xiong Wang
Binbin Zhang
Fan Yu
Chao Yang
Zhendong Peng
Xiaoyu Chen
Lei Xie
X. Lei
73
267
0
02 Feb 2021
A Review of Speaker Diarization: Recent Advances with Deep Learning
Tae Jin Park
Naoyuki Kanda
Dimitrios Dimitriadis
Kyu Jeong Han
Shinji Watanabe
Shrikanth Narayanan
VLM
326
332
0
24 Jan 2021
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning
Wei Xia
Chunlei Zhang
Chao Weng
Meng Yu
Dong Yu
SSL
59
79
0
13 Dec 2020
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
133
203
0
11 Dec 2020
MLS: A Large-Scale Multilingual Dataset for Speech Research
Vineel Pratap
Qiantong Xu
Anuroop Sriram
Gabriel Synnaeve
R. Collobert
AuLLM
86
503
0
07 Dec 2020
Playing a Part: Speaker Verification at the Movies
A. Brown
Jaesung Huh
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
38
23
0
29 Oct 2020
An iterative framework for self-supervised deep speaker representation learning
Danwei Cai
Weiqing Wang
Ming Li
SSL
41
37
0
25 Oct 2020
Y-Vector: Multiscale Waveform Encoder for Speaker Embedding
Ge Zhu
Fei Jiang
Z. Duan
34
25
0
24 Oct 2020
The IDLAB VoxCeleb Speaker Recognition Challenge 2020 System Description
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
58
49
0
23 Oct 2020
The IDLAB VoxSRC-20 Submission: Large Margin Fine-Tuning and Quality-Aware Score Calibration in DNN Based Speaker Verification
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
61
84
0
21 Oct 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
104
322
0
09 Aug 2020
Concept Bottleneck Models
Pang Wei Koh
Thao Nguyen
Y. S. Tang
Stephen Mussmann
Emma Pierson
Been Kim
Percy Liang
96
823
0
09 Jul 2020
End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors
Shota Horiguchi
Yusuke Fujita
Shinji Watanabe
Yawen Xue
Kenji Nagamatsu
126
190
0
20 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
223
3,139
0
16 May 2020
Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario
Ivan Medennikov
M. Korenevsky
Tatiana Prisyach
Yuri Y. Khokhlov
Mariya Korenevskaya
...
Anton Mitrofanov
A. Andrusenko
Ivan Podluzhny
A. Laptev
A. Romanenko
45
199
0
14 May 2020
ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification
Brecht Desplanques
Jenthe Thienpondt
Kris Demuynck
74
1,338
0
14 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
Zexin Cai
Chuxiong Zhang
Ming Li
55
42
0
10 May 2020
The Attacker's Perspective on Automatic Speaker Verification: An Overview
Rohan Kumar Das
Xiaohai Tian
Tomi Kinnunen
Haizhou Li
AAML
50
80
0
19 Apr 2020
SpEx: Multi-Scale Time Domain Speaker Extraction Network
Chenglin Xu
Wei Rao
Eng Siong Chng
Haizhou Li
50
169
0
17 Apr 2020
Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs
Seong Min Kye
Youngmoon Jung
Haebeom Lee
Sung Ju Hwang
Hoirin Kim
79
50
0
06 Apr 2020
In defence of metric learning for speaker recognition
Joon Son Chung
Jaesung Huh
Seongkyu Mun
Minjae Lee
Hee-Soo Heo
Soyeon Choe
Chiheon Ham
Sung-Ye Jung
Bong-Jin Lee
Icksang Han
64
436
0
26 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
369
18,778
0
13 Feb 2020
The FFSVC 2020 Evaluation Plan
Xiaoyi Qin
Ming Li
Hui Bu
Rohan Kumar Das
Wei Rao
Shrikanth Narayanan
Haizhou Li
61
21
0
02 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
605
4,822
0
23 Jan 2020
Libri-Light: A Benchmark for ASR with Limited or No Supervision
Jacob Kahn
M. Rivière
Weiyi Zheng
Evgeny Kharitonov
Qiantong Xu
...
Tatiana Likhomanenko
Gabriel Synnaeve
Armand Joulin
Abdel-rahman Mohamed
Emmanuel Dupoux
AuLLM
67
672
0
17 Dec 2019
HI-MIA : A Far-field Text-Dependent Speaker Verification Database and the Baselines
Xiaoyi Qin
Hui Bu
Ming Li
51
68
0
03 Dec 2019
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
201
12,085
0
13 Nov 2019
CN-CELEB: a challenging Chinese speaker recognition dataset
Yue Fan
Jiawen Kang
Lantian Li
Keliang Li
Haolin Chen
Sitong Cheng
Pengyuan Zhang
Ziya Zhou
Yunqi Cai
Dong Wang
62
205
0
31 Oct 2019
BUT System Description to VoxCeleb Speaker Recognition Challenge 2019
Hossein Zeinali
Shuai Wang
Anna Silnova
P. Matejka
Oldrich Plchot
DRL
69
247
0
16 Oct 2019
Deep Latent Space Learning for Cross-modal Mapping of Audio and Visual Signals
Shah Nawaz
Muhammad Kamran Janjua
I. Gallo
Arif Mahmood
Alessandro Calefati
54
33
0
18 Sep 2019
Probing the Information Encoded in X-vectors
Desh Raj
David Snyder
Daniel Povey
Sanjeev Khudanpur
89
87
0
13 Sep 2019
End-to-End Neural Speaker Diarization with Self-attention
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
217
240
0
13 Sep 2019
Personal VAD: Speaker-Conditioned Voice Activity Detection
Shaojin Ding
Quan Wang
Shuo-yiin Chang
Li Wan
Ignacio López Moreno
44
75
0
12 Aug 2019
Cross-lingual Text-independent Speaker Verification using Unsupervised Adversarial Discriminative Domain Adaptation
Wei Xia
Jing-ling Huang
John H. L. Hansen
72
58
0
05 Aug 2019
RawNet: Advanced end-to-end deep neural network using raw waveforms for text-independent speaker verification
Jee-weon Jung
Hee-Soo Heo
Ju-ho Kim
Hye-jin Shim
Ha-Jin Yu
56
142
0
17 Apr 2019
Self-supervised speaker embeddings
Themos Stafylakis
Johan Rohdin
Oldrich Plchot
Petr Mizera
L. Burget
SSL
31
48
0
06 Apr 2019
Utterance-level Aggregation For Speaker Recognition In The Wild
Weidi Xie
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
59
344
0
26 Feb 2019
Deep Speaker Embedding Learning with Multi-Level Pooling for Text-Independent Speaker Verification
Yun Tang
Guo-Hong Ding
Jing Huang
Xiaodong He
Bowen Zhou
56
82
0
21 Feb 2019
Learning Speaker Representations with Mutual Information
Mirco Ravanelli
Yoshua Bengio
SSL
DRL
69
91
0
01 Dec 2018
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno
68
368
0
11 Oct 2018
Attention Mechanism in Speaker Recognition: What Does It Learn in Deep Speaker Embedding?
Qiongqiong Wang
K. Okabe
Kong Aik Lee
Hitoshi Yamamoto
Takafumi Koshinaka
49
31
0
25 Sep 2018
AISHELL-2: Transforming Mandarin ASR Research Into Industrial Scale
Jiayu Du
Xingyu Na
Xuechen Liu
Hui Bu
VLM
54
287
0
31 Aug 2018
Speaker Recognition from Raw Waveform with SincNet
Mirco Ravanelli
Yoshua Bengio
159
715
0
29 Jul 2018
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
Yandong Wen
Mahmoud Al Ismail
Weiyang Liu
Bhiksha Raj
Rita Singh
FedML
49
71
0
12 Jul 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
256
830
0
12 Jun 2018
Voices Obscured in Complex Environmental Settings (VOICES) corpus
Colleen Richey
Maria Artigas
Zeb Armstrong
C. Bartels
H. Franco
...
Julien van Hout
Paul Gamble
Jeff Hetherly
Cory Stephenson
Karl S. Ni
56
127
0
13 Apr 2018
ESPnet: End-to-End Speech Processing Toolkit
Shinji Watanabe
Takaaki Hori
Shigeki Karita
Tomoki Hayashi
Jiro Nishitoba
...
Jahn Heymann
Sanjeev Khudanpur
Nanxin Chen
Adithya Renduchintala
Tsubasa Ochiai
VLM
109
1,507
0
30 Mar 2018
VoxCeleb: a large-scale speaker identification dataset
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
125
2,274
0
26 Jun 2017