arXiv: 1706.08612
VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
Title
Detecting Vocal Fatigue with Neural Embeddings
Sebastian P. Bayerl
Dominik Wagner
Ilja Baumann
Korbinian Riedhammer
Tobias Bocklet
64
11
0
07 Apr 2022
Design Guidelines for Inclusive Speaker Verification Evaluation Datasets
W. Hutiri
Lauriane Gorce
Aaron Yi Ding
62
7
0
05 Apr 2022
Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification
Sung Hwan Mun
Jee-weon Jung
Min Hyun Han
N. Kim
118
21
0
03 Apr 2022
Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey
Sparsh Sinha
Susmita Ghose
58
3
0
31 Mar 2022
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset
Zehui Yang
Yifan Chen
Lei Luo
Runyan Yang
Lingxuan Ye
...
Yaohui Jin
Qingqing Zhang
Pengyuan Zhang
Lei Xie
Yonghong Yan
69
51
0
31 Mar 2022
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
Fan Yu
Zhihao Du
Shiliang Zhang
Yuxiao Lin
Linfu Xie
42
15
0
31 Mar 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation
Kuan Po Huang
Yuanbin Fu
Yu Zhang
Hung-yi Lee
87
28
0
30 Mar 2022
Decomposed Temporal Dynamic CNN: Efficient Time-Adaptive Network for Text-Independent Speaker Verification Explained with Speaker Activation Map
Seong-Hu Kim
Hyeonuk Nam
Yong-Hwa Park
80
9
0
29 Mar 2022
NeuraGen-A Low-Resource Neural Network based approach for Gender Classification
Shankhanil Ghosh
Chhanda Saha
Nagamani Molakathaala
16
2
0
29 Mar 2022
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification
Yang Zhang
Zhiqiang Lv
Haibin Wu
Shanshan Zhang
Pengfei Hu
Zhiyong Wu
Hung-yi Lee
Helen Meng
ViT
100
137
0
29 Mar 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
58
18
0
28 Mar 2022
Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
70
11
0
28 Mar 2022
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
45
6
0
28 Mar 2022
Thin-Plate Spline Motion Model for Image Animation
Jian Zhao
Hui Zhang
90
196
0
27 Mar 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
99
28
0
27 Mar 2022
A Speech Representation Anonymization Framework via Selective Noise Perturbation
Minh Tran
M. Soleymani
74
5
0
26 Mar 2022
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Sreyan Ghosh
Ashish Seth
Deepak Mittal
Maneesh Singh
S. Umesh
SSL
62
6
0
25 Mar 2022
3D GAN Inversion for Controllable Portrait Image Animation
Connor Z. Lin
David B. Lindell
E. R. Chan
Gordon Wetzstein
3DH
72
63
0
25 Mar 2022
The VoicePrivacy 2022 Challenge Evaluation Plan
N. Tomashenko
Xin Wang
Xiaoxiao Miao
Hubert Nourtel
Pierre Champion
Massimiliano Todisco
Emmanuel Vincent
Nicholas W. D. Evans
Junichi Yamagishi
J. Bonastre
117
63
0
23 Mar 2022
Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model
Tarun Gupta
Duc-Tuan Truong
Tran The Anh
Chng Eng Siong
57
16
0
22 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis
Rishabh Jain
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
69
14
0
22 Mar 2022
Automated detection of foreground speech with wearable sensing in everyday home environments: A transfer learning approach
Dawei Liang
Zifan Xu
Yinuo Chen
Rebecca Adaimi
David Harwath
Edison Thomaz
71
1
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
81
7
0
21 Mar 2022
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis
Jinlong Xue
Yayue Deng
Yichen Han
Ya Li
Jianqing Sun
Jiaen Liang
51
8
0
20 Mar 2022
Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Cho-Ying Wu
Chin-Cheng Hsu
Ulrich Neumann
CVBM
66
14
0
18 Mar 2022
TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding
Ruiteng Zhang
Jianguo Wei
Xugang Lu
Wenhuan Lu
Di Jin
Junhai Xu
Lin Zhang
Y. Ji
Jianwu Dang
56
4
0
17 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
88
90
0
16 Mar 2022
Depth-Aware Generative Adversarial Network for Talking Head Video Generation
Fa-Ting Hong
Longhao Zhang
Li Shen
Dan Xu
3DH
CVBM
98
177
0
13 Mar 2022
StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN
Fei Yin
Yong Zhang
Xiaodong Cun
Ming Cao
Yanbo Fan
Xuanxia Wang
Qingyan Bai
Baoyuan Wu
Jue Wang
Yujiu Yang
CVBM
119
175
0
08 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
99
29
0
07 Mar 2022
Voice-Face Homogeneity Tells Deepfake
Harry Cheng
Yangyang Guo
Tianyi Wang
Qi Li
Xiaojun Chang
Liqiang Nie
CVBM
97
73
0
04 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabeleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
102
109
0
02 Mar 2022
TRILLsson: Distilled Universal Paralinguistic Speech Representations
Joel Shor
Subhashini Venugopalan
79
41
0
01 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
55
7
0
28 Feb 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
175
25
0
26 Feb 2022
Improving fairness in speaker verification via Group-adapted Fusion Network
Hua Shen
Yuguang Yang
G. Sun
Ryan Langman
Eunjung Han
J. Droppo
A. Stolcke
37
16
0
23 Feb 2022
Thinking the Fusion Strategy of Multi-reference Face Reenactment
T. Yashima
T. Narihira
Tamaki Kojima
DiffM
CVBM
66
1
0
22 Feb 2022
Contrastive-mixup learning for improved speaker verification
Xin Zhang
Minho Jin
R. Cheng
Ruirui Li
Eunjung Han
A. Stolcke
AAML
SSL
58
11
0
22 Feb 2022
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation
Disong Wang
Songxiang Liu
Xixin Wu
Hui Lu
Lifa Sun
Xunying Liu
Helen Meng
72
5
0
18 Feb 2022
Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences
Vinayak Gupta
Srikanta J. Bedathur
A. De
AI4TS
53
13
0
17 Feb 2022
I'm Hearing (Different) Voices: Anonymous Voices to Protect User Privacy
H.C.M. Turner
Giulio Lovisotto
Simon Eberz
Ivan Martinovic
30
1
0
13 Feb 2022
Learnable Nonlinear Compression for Robust Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
58
2
0
10 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification
Shimaa Ahmed
Yash R. Wani
Ali Shahin Shamsabadi
Mohammad Yaghini
Ilia Shumailov
Nicolas Papernot
Kassem Fawaz
AAML
62
4
0
06 Feb 2022
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge
Naijun Zheng
Na Li
Xixin Wu
Lingwei Meng
Jiawen Kang
Haibin Wu
Chao Weng
Jane Polak Scowcroft
Helen Meng
73
10
0
04 Feb 2022
MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances
Tianchi Liu
Rohan Kumar Das
Kong Aik Lee
Haizhou Li
134
72
0
03 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
80
54
0
02 Feb 2022
Finding Directions in GAN's Latent Space for Neural Face Reenactment
Stella Bounareli
Vasileios Argyriou
Georgios Tzimiropoulos
3DH
CVBM
102
35
0
31 Jan 2022
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention
Artem Gorodetskii
Ivan Ozhiganov
115
2
0
25 Jan 2022
SASV Challenge 2022: A Spoofing Aware Speaker Verification Challenge Evaluation Plan
Jee-weon Jung
Hemlata Tak
Hye-jin Shim
Hee-Soo Heo
Bong-Jin Lee
Soo-Whan Chung
Hong-Goo Kang
Ha-Jin Yu
Nicholas W. D. Evans
Tomi Kinnunen
98
31
0
25 Jan 2022
Bias in Automated Speaker Recognition
Wiebke Toussaint
Aaron Yi Ding
CVBM
70
44
0
24 Jan 2022