1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
Haoning Zhu
Wayne Wu
Wentao Zhu
Liming Jiang
Siwei Tang
Li Zhang
Ziwei Liu
Chen Change Loy
140
171
0
25 Jul 2022
Fine-grained Early Frequency Attention for Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
88
4
0
20 Jul 2022
Adversarial Reweighting for Speaker Verification Fairness
Minho Jin
Chelsea J.-T. Ju
Zeya Chen
Yi-Chieh Liu
J. Droppo
A. Stolcke
46
5
0
15 Jul 2022
The DKU-OPPO System for the 2022 Spoofing-Aware Speaker Verification Challenge
Xingming Wang
Xiaoyi Qin
Yikang Wang
Yunfei Xu
Ming Li
113
14
0
15 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
112
43
0
14 Jul 2022
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
Xiaoyi Qin
Na Li
Chao Weng
Jane Polak Scowcroft
Ming Li
123
18
0
13 Jul 2022
Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning
Théo Lepage
Réda Dehak
SSL
147
12
0
12 Jul 2022
PoeticTTS -- Controllable Poetry Reading for Literary Studies
Julia Koch
Florian Lux
Nadja Schauffler
T. Bernhart
Felix Dieterle
Jonas Kuhn
Sandra Richter
Gabriel Viehhauser
Ngoc Thang Vu
66
5
0
11 Jul 2022
Speaker Anonymization with Phonetic Intermediate Representations
Sarina Meyer
Florian Lux
Pavel Denisov
Julia Koch
Pascal Tilli
Ngoc Thang Vu
83
28
0
11 Jul 2022
The HCCL System for the NIST SRE21
Zhuo Li
Runqiu Xiao
Hangting Chen
Zhenduo Zhao
Zi-qiang Zhang
Wenchao Wang
57
0
0
11 Jul 2022
Multi-Frequency Information Enhanced Channel Attention Module for Speaker Representation Learning
Mufan Sang
John H. L. Hansen
64
13
0
10 Jul 2022
Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification
Long Chen
Yi Meng
Venkatesh Ravichandran
A. Stolcke
31
1
0
08 Jul 2022
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion
Ahmad Aloradi
Wolfgang Mack
Mohamed Elminshawi
Emanuel Habets
63
5
0
28 Jun 2022
Domain Agnostic Few-shot Learning for Speaker Verification
Seunghan Yang
Debasmit Das
Jang Hyun Cho
Hyoungwoo Park
Sungrack Yun
OOD
70
7
0
28 Jun 2022
Extended U-Net for Speaker Verification in Noisy Environments
Ju-ho Kim
Ju-Sung Heo
Hye-jin Shim
Ha-Jin Yu
39
16
0
27 Jun 2022
Transport-Oriented Feature Aggregation for Speaker Embedding Learning
Yusheng Tian
Jingyu Li
Tan Lee
39
1
0
26 Jun 2022
Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech
Florian Lux
Julia Koch
Ngoc Thang Vu
77
20
0
24 Jun 2022
Modeling Continuous Time Sequences with Intermittent Observations using Marked Temporal Point Processes
Vinayak Gupta
Srikanta J. Bedathur
Sourangshu Bhattacharya
A. De
AI4TS
92
13
0
23 Jun 2022
Towards End-to-End Private Automatic Speaker Recognition
Francisco Teixeira
A. Abad
Bhiksha Raj
Isabel Trancoso
97
10
0
23 Jun 2022
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion
Haibin Wu
Jiawen Kang
Lingwei Meng
Yang Zhang
Xixin Wu
Zhiyong Wu
Hung-yi Lee
Helen Meng
69
9
0
18 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
93
10
0
18 Jun 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
59
7
0
18 Jun 2022
HairFIT: Pose-Invariant Hairstyle Transfer via Flow-based Hair Alignment and Semantic-Region-Aware Inpainting
Chaeyeon Chung
Taewoo Kim
Hyelin Nam
Seunghwan Choi
Gyojung Gu
S. Park
Jaegul Choo
3DH
68
7
0
17 Jun 2022
VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection
Joanna Hong
Minsu Kim
Y. Ro
CVBM
DiffM
67
8
0
15 Jun 2022
The Influence of Dataset Partitioning on Dysfluency Detection Systems
Sebastian P. Bayerl
Dominik Wagner
Elmar Nöth
Tobias Bocklet
Korbinian Riedhammer
82
20
0
07 Jun 2022
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Feng Wang
Jiashui Wang
AAML
110
40
0
07 Jun 2022
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Nan Zhang
Jianzong Wang
Zhenhou Hong
Chendong Zhao
Xiaoyang Qu
Jing Xiao
116
5
0
26 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
98
36
0
22 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
287
368
0
21 May 2022
Dynamic Recognition of Speakers for Consent Management by Contrastive Embedding Replay
Arash Shahmansoori
U. Roedig
66
1
0
17 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
69
6
0
17 May 2022
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
Rongjie Huang
Yi Ren
Jinglin Liu
Chenye Cui
Zhou Zhao
OODD
VLM
195
34
0
15 May 2022
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT
Bowen Shi
Abdel-rahman Mohamed
Wei-Ning Hsu
SSL
69
18
0
15 May 2022
The VoicePrivacy 2020 Challenge Evaluation Plan
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
84
44
0
14 May 2022
Collar-aware Training for Streaming Speaker Change Detection in Broadcast Speech
Joonas Kalda
Tanel Alumäe
50
3
0
14 May 2022
Gamified Speaker Comparison by Listening
Sandip Ghimire
Tomi Kinnunen
Rosa González Hautamäki
23
0
0
10 May 2022
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information
Chiyu Feng
Po-Chun Hsu
Hung-yi Lee
SSL
86
8
0
08 May 2022
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution
Liangbin Xie
Honglun Zhang
Chao Dong
Ying Shan
CVBM
86
87
0
06 May 2022
SVTS: Scalable Video-to-Speech Synthesis
Rodrigo Mira
A. Haliassos
Stavros Petridis
Björn W. Schuller
Maja Pantic
71
35
0
04 May 2022
Baselines and Protocols for Household Speaker Recognition
A. Sholokhov
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
88
4
0
30 Apr 2022
Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast
Boqing Zhu
Kele Xu
Changjian Wang
Zheng Qin
Tao Sun
Huaimin Wang
Yuxing Peng
SSL
65
18
0
28 Apr 2022
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li
Xiaofei Li
ViT
58
22
0
26 Apr 2022
Back-ends Selection for Deep Speaker Embeddings
Zhuo Li
Runqiu Xiao
Zi-qiang Zhang
Zhenduo Zhao
Wenchao Wang
Pengyuan Zhang
61
0
0
25 Apr 2022
Unifying Cosine and PLDA Back-ends for Speaker Verification
Zhiyuan Peng
Xuanji He
Ke Ding
Tan Lee
Guanglu Wan
60
4
0
22 Apr 2022
Conditional Injective Flows for Bayesian Imaging
AmirEhsan Khorashadizadeh
K. Kothari
Leonardo Salsi
Ali Aghababaei Harandi
Maarten V. de Hoop
Ivan Dokmanić
MedIm
78
16
0
15 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
100
59
0
15 Apr 2022
The effect of speech pathology on automatic speaker verification -- a large-scale study
Soroosh Tayebi Arasteh
Tobias Weise
Maria Schuster
E. Noeth
Andreas Maier
Seung Hee Yang
83
9
0
13 Apr 2022
Structure-Aware Motion Transfer with Deformable Anchor Model
Jiale Tao
Biao Wang
Borun Xu
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
87
41
0
11 Apr 2022
Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning
Salah Zaiem
Titouan Parcollet
S. Essid
SSL
41
6
0
08 Apr 2022
Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?
Qiongqiong Wang
Kong Aik Lee
Tianchi Liu
67
16
0
08 Apr 2022