1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
Title
Configurable Privacy-Preserving Automatic Speech Recognition
Ranya Aloufi
Hamed Haddadi
David E. Boyle
67
10
0
01 Apr 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
Adam Polyak
Yossi Adi
Jade Copet
Eugene Kharitonov
Kushal Lakhotia
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
130
318
0
01 Apr 2021
Improved Meta-Learning Training for Speaker Verification
Yafeng Chen
Wu Guo
Bin Gu
118
7
0
29 Mar 2021
Scalable and Efficient Neural Speech Coding: A Hybrid Design
Kai Zhen
Jongmo Sung
Mi Suk Lee
Seung-Wha Beack
Minje Kim
95
14
0
27 Mar 2021
EfficientTDNN: Efficient Architecture Search for Speaker Recognition
Rui Wang
Zhihua Wei
Haoran Duan
S. Ji
Yang Long
Zhenhou Hong
138
18
0
25 Mar 2021
PriorityCut: Occlusion-guided Regularization for Warp-based Image Animation
Wai Ting Cheung
Gyeongsu Chae
VGen
108
2
0
22 Mar 2021
USTC-NELSLIP System Description for DIHARD-III Challenge
Yuxuan Wang
Maokui He
Shutong Niu
Lei Sun
Tian Gao
Xin Fang
Jia Pan
Jun Du
Chin-Hui Lee
76
30
0
19 Mar 2021
KoDF: A Large-scale Korean DeepFake Detection Dataset
Patrick Kwon
J. You
Gyuhyeon Nam
Sungwoo Park
Gyeongsu Chae
109
104
0
18 Mar 2021
Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning
Siyang Yuan
Pengyu Cheng
Ruiyi Zhang
Weituo Hao
Zhe Gan
Lawrence Carin
DRL
66
61
0
17 Mar 2021
Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
Peisong Wen
Qianqian Xu
Yangbangyan Jiang
Zhiyong Yang
Yuan He
Qingming Huang
CVBM
61
33
0
12 Mar 2021
Learning spectro-temporal representations of complex sounds with parameterized neural networks
Rachid Riad
Julien Karadayi
Anne-Catherine Bachoud-Lévi
Emmanuel Dupoux
49
7
0
12 Mar 2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
107
179
0
11 Mar 2021
EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition
Maurice Gerczuk
Shahin Amiriparian
Sandra Ottl
Björn Schuller
95
59
0
10 Mar 2021
Am I a Real or Fake Celebrity? Measuring Commercial Face Recognition Web APIs under Deepfake Impersonation Attack
Shahroz Tariq
Sowon Jeon
Simon S. Woo
73
25
0
01 Mar 2021
Learnable MFCCs for Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
54
17
0
20 Feb 2021
AudioVisual Speech Synthesis: A brief literature review
Efthymios Georgiou
Athanasios Katsamanis
25
0
0
18 Feb 2021
Biometrics in the Era of COVID-19: Challenges and Opportunities
M. Gomez-Barrero
P. Drozdowski
Christian Rathgeb
J. Patino
Massimiliano Todisco
A. Nautsch
Naser Damer
Jannier Priesnitz
Nicholas W. D. Evans
Christoph Busch
79
54
0
18 Feb 2021
Adversarial defense for automatic speaker verification by cascaded self-supervised learning models
Haibin Wu
Xu Li
Andy T. Liu
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
86
41
0
14 Feb 2021
A Multi-View Approach To Audio-Visual Speaker Verification
Leda Sari
Kritika Singh
Jiatong Zhou
Lorenzo Torresani
Nayan Singhal
Yatharth Saraf
123
38
0
11 Feb 2021
ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech
A. Nautsch
Xin Wang
Nicholas W. D. Evans
Tomi Kinnunen
Ville Vestman
Massimiliano Todisco
Héctor Delgado
Md. Sahidullah
Junichi Yamagishi
Kong Aik Lee
194
154
0
11 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning
Giuseppe Ruggiero
Enrico Zovato
Luigi Di Caro
V. Pollet
DiffM
63
10
0
10 Feb 2021
The DKU-Duke-Lenovo System Description for the Third DIHARD Speech Diarization Challenge
Weiqing Wang
Qingjian Lin
Danwei Cai
Lin Yang
Ming Li
35
8
0
06 Feb 2021
Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks
Peter Wu
Paul Pu Liang
Jiatong Shi
Ruslan Salakhutdinov
Shinji Watanabe
Louis-Philippe Morency
63
9
0
22 Jan 2021
LEAF: A Learnable Frontend for Audio Classification
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM
AAML
134
148
0
21 Jan 2021
MAAS: Multi-modal Assignation for Active Speaker Detection
Juan Carlos León Alcázar
Fabian Caba Heilbron
Ali K. Thabet
Guohao Li
130
52
0
11 Jan 2021
FakeBuster: A DeepFakes Detection Tool for Video Conferencing Scenarios
V. Mehta
Parul Gupta
Ramanathan Subramanian
Abhinav Dhall
CVBM
67
22
0
09 Jan 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
247
202
0
08 Jan 2021
What all do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure
Jui Shah
Yaman Kumar Singla
Changyou Chen
R. Shah
93
81
0
02 Jan 2021
Generalized Operating Procedure for Deep Learning: an Unconstrained Optimal Design Perspective
Shen Chen
Mingwei Zhang
Jiamin Cui
Wei Yao
CVBM
49
0
0
31 Dec 2020
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
Federico Landini
Jan Profant
Mireia Díez
L. Burget
287
209
0
29 Dec 2020
A Principle Solution for Enroll-Test Mismatch in Speaker Recognition
Lantian Li
Dong Wang
Jiawen Kang
Renyu Wang
Jingqian Wu
Zhendong Gao
Xiao Chen
65
7
0
23 Dec 2020
CN-Celeb: multi-genre speaker recognition
Lantian Li
Ruiqi Liu
Jiawen Kang
Yue Fan
Hao Cui
Yunqi Cai
Ravichander Vipperla
Tianshi Zheng
Dong Wang
101
123
0
23 Dec 2020
Multi-stream Convolutional Neural Network with Frequency Selection for Robust Speaker Verification
Wei Yao
Shen Chen
Jiamin Cui
Yaolin Lou
76
6
0
21 Dec 2020
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording
Cong Han
Yi Luo
Chenda Li
Tianyan Zhou
K. Kinoshita
...
Marc Delcroix
Hakan Erdogan
J. Hershey
N. Mesgarani
Zhuo Chen
58
8
0
17 Dec 2020
HeadGAN: One-shot Neural Head Synthesis and Editing
M. Doukas
Stefanos Zafeiriou
V. Sharmanska
CVBM
3DH
62
129
0
15 Dec 2020
Few Shot Adaptive Normalization Driven Multi-Speaker Speech Synthesis
Neeraj Kumar
Srishti Goel
Ankur Narang
Brejesh Lall
68
5
0
14 Dec 2020
DEAAN: Disentangled Embedding and Adversarial Adaptation Network for Robust Speaker Representation Learning
Mufan Sang
Wei Xia
John H. L. Hansen
OOD
DRL
94
23
0
12 Dec 2020
VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge
Arsha Nagrani
Joon Son Chung
Jaesung Huh
Andrew Brown
Ernesto Coto
Weidi Xie
Mitchell McLaren
D. Reynolds
Andrew Zisserman
76
74
0
12 Dec 2020
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
159
203
0
11 Dec 2020
Adversarial Disentanglement of Speaker Representation for Attribute-Driven Privacy Preservation
Paul-Gauthier Noé
Mohammad MohammadAmini
D. Matrouf
Titouan Parcollet
Andreas Nautsch
J. Bonastre
100
28
0
08 Dec 2020
A Study of Few-Shot Audio Classification
Piper Wolters
Chris Careaga
Brian Hutchinson
Lauren A. Phillips
107
10
0
02 Dec 2020
Joint gender and age estimation based on speech signals using x-vectors and transfer learning
Damian Kwaśny
Daria Hemmerling
31
11
0
02 Dec 2020
A Unified Deep Speaker Embedding Framework for Mixed-Bandwidth Speech Data
Weicheng Cai
Ming Li
144
4
0
01 Dec 2020
Low Bandwidth Video-Chat Compression using Deep Generative Models
Maxime Oquab
Pierre Stock
Oran Gafni
Daniel Haziza
Tao Xu
...
Yana Hasson
Patrick Labatut
Bobo Bose-Kolanu
T. Peyronel
Camille Couprie
3DH
76
44
0
01 Dec 2020
Look who's not talking
Youngki Kwon
Hee-Soo Heo
Jaesung Huh
Bong-Jin Lee
Joon Son Chung
36
29
0
30 Nov 2020
How Far Are We from Robust Voice Conversion: A Survey
Tzu-hsien Huang
Jheng-hao Lin
Chien-yu Huang
Hung-yi Lee
96
25
0
24 Nov 2020
Exploring Voice Conversion based Data Augmentation in Text-Dependent Speaker Verification
Xiaoyi Qin
Yaogen Yang
Lin Yang
Xuyang Wang
Junjie Wang
Ming Li
49
0
0
21 Nov 2020
FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances
Ali Shahin Shamsabadi
Francisco Teixeira
A. Abad
Bhiksha Raj
Andrea Cavallaro
Isabel Trancoso
AAML
62
30
0
17 Nov 2020
Image Animation with Perturbed Masks
Yoav Shalev
Lior Wolf
DiffM
VGen
19
8
0
13 Nov 2020
Supervised attention for speaker recognition
Seong Min Kye
Joon Son Chung
Hoirin Kim
94
11
0
10 Nov 2020