1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
Title
PV3D: A 3D Generative Model for Portrait Video Generation
Eric Xu
Jianfeng Zhang
Jun Hao Liew
Wenqing Zhang
Song Bai
Jiashi Feng
Mike Zheng Shou
VGen
84
21
0
13 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
76
40
0
10 Dec 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Koike Hideki
62
34
0
09 Dec 2022
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
98
38
0
07 Dec 2022
DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads
Seah Kim
Hyoukjun Kwon
Jinook Song
Jihyuck Jo
Yu-Hsin Chen
Liangzhen Lai
Vikas Chandra
AI4TS
69
11
0
07 Dec 2022
Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition
Zhiyuan Peng
Xuanji He
Ke Ding
Tan Lee
Guanglu Wan
57
6
0
06 Dec 2022
Covariance Regularization for Probabilistic Linear Discriminant Analysis
Zhiyuan Peng
Mingjie Shao
Xuanji He
Xu Li
Tan Lee
Ke Ding
Guanglu Wan
50
1
0
06 Dec 2022
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim
Hajin Shim
Hyunsung Kim
Yunjey Choi
Junho Kim
Eunho Yang
DiffM
VGen
76
32
0
06 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
54
17
0
06 Dec 2022
Topological Data Analysis for Speech Processing
Eduard Tulchinskii
Kristian Kuznetsov
Laida Kushnareva
D. Cherniavskii
S. Barannikov
Irina Piontkovskaya
Sergey I. Nikolenko
Evgeny Burnaev
70
6
0
30 Nov 2022
MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages
Yue Li
Li Zhang
Na Wang
Jie Liu
Linfu Xie
82
0
0
30 Nov 2022
Evaluating and reducing the distance between synthetic and real speech distributions
Christoph Minixhofer
Ondřej Klejch
P. Bell
82
8
0
29 Nov 2022
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
K. Cheng
Xiaodong Cun
Yong Zhang
Menghan Xia
Fei Yin
Mingrui Zhu
Xuanxia Wang
Jue Wang
Nan Wang
CVBM
76
106
0
27 Nov 2022
Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
Yu Deng
Baoyuan Wang
H. Shum
3DH
98
11
0
25 Nov 2022
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
Y. Liu
Wenbin Wang
Yibing Zhan
Shaoze Feng
Li-Yu Daisy Liu
Zhe Chen
SSL
69
13
0
24 Nov 2022
A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition
Jiacheng Zhang
Wenyi Yan
Ye Zhang
33
2
0
24 Nov 2022
Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation
Yunfan Liu
Qi Li
Zhen Sun
Tieniu Tan
CVBM
61
0
0
23 Nov 2022
Complex-Valued Time-Frequency Self-Attention for Speech Dereverberation
Vinay Kothapally
John H. L. Hansen
48
9
0
22 Nov 2022
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Wenxuan Zhang
Xiaodong Cun
Xuan Wang
Yong Zhang
Xiaodong Shen
Yu-Xiao Guo
Ying Shan
Fei Wang
VGen
99
256
0
22 Nov 2022
Robust Training for Speaker Verification against Noisy Labels
Zhihua Fang
Liang He
Hanhan Ma
Xiao-Min Guo
Lin Li
NoLa
80
3
0
22 Nov 2022
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
Korbinian Riedhammer
Tobias Bocklet
77
2
0
16 Nov 2022
Towards an objective characterization of an individual's facial movements using Self-Supervised Person-Specific-Models
Yanis Tazi
M. Berger
W. Freiwald
114
0
0
15 Nov 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
70
0
0
14 Nov 2022
Towards A Unified Conformer Structure: from ASR to ASV Task
Dexin Liao
Tao Jiang
Feng Wang
Lin Li
Q. Hong
91
10
0
14 Nov 2022
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities
Yikang Wang
Xingming Wang
Hiromitsu Nishizaki
Ming Li
52
6
0
12 Nov 2022
Speech separation with large-scale self-supervised learning
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu-Huan Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
S. Sivasankaran
Sefik Emre Eskimez
83
15
0
09 Nov 2022
Pushing the limits of self-supervised speaker verification using regularized distillation framework
Yafeng Chen
Siqi Zheng
Haibo Wang
Luyao Cheng
Qian Chen
75
27
0
08 Nov 2022
High-resolution embedding extractor for speaker diarisation
Hee-Soo Heo
Youngki Kwon
Bong-Jin Lee
You Jin Kim
Jee-weon Jung
70
5
0
08 Nov 2022
SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers
Alessandro Arezzo
Stefano Berretti
ViT
48
17
0
04 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
28
5
0
04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
21
0
0
03 Nov 2022
SLICER: Learning universal audio representations using low-resource self-supervised pre-training
Ashish Seth
Sreyan Ghosh
S. Umesh
Tianyi Zhou
SSL
80
3
0
02 Nov 2022
MAST: Multiscale Audio Spectrogram Transformers
Sreyan Ghosh
Ashish Seth
S. Umesh
Tianyi Zhou
83
3
0
02 Nov 2022
LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification
Xingqi Chen
Jie Wang
Xiaoli Zhang
Weiqiang Zhang
Kunde Yang
AAML
116
7
0
02 Nov 2022
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022
Zhengyang Chen
Bing Han
Xu Xiang
Houjun Huang
Bei Liu
Y. Qian
91
14
0
02 Nov 2022
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings
Mohan Shi
Jie Zhang
Zhihao Du
Fan Yu
Qian Chen
Shiliang Zhang
Lirong Dai
97
4
0
01 Nov 2022
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Zili Huang
Desh Raj
Leibny Paola García-Perera
Sanjeev Khudanpur
155
29
0
01 Nov 2022
Disentangled representation learning for multilingual speaker recognition
KiHyun Nam
Youkyum Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
93
9
0
01 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
59
3
0
31 Oct 2022
Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification
Jingyu Li
Yusheng Tian
Tan Lee
55
9
0
31 Oct 2022
Combining Automatic Speaker Verification and Prosody Analysis for Synthetic Speech Detection
L. Attorresi
Davide Salvi
Clara Borrelli
Paolo Bestagini
Stefano Tubaro
111
24
0
31 Oct 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
59
0
0
29 Oct 2022
Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
24
2
0
28 Oct 2022
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction
Ming Cheng
Weiqing Wang
Yucong Zhang
Xiaoyi Qin
Ming Li
VLM
102
38
0
28 Oct 2022
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters
Junyi Peng
Themos Stafylakis
Rongzhi Gu
Oldřich Plchot
Ladislav Mošner
Lukáš Burget
Jan "Honza" Černocký
130
22
0
28 Oct 2022
Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments
Yuke Lin
Xiaoyi Qin
Huahua Cui
Zhenyi Zhu
Ming Li
51
1
0
28 Oct 2022
A comprehensive study on self-supervised distillation for speaker representation learning
Zhengyang Chen
Yao Qian
Bing Han
Y. Qian
Michael Zeng
SSL
132
17
0
28 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
77
13
0
28 Oct 2022
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamaki
Haizhou Li
SSL
90
12
0
27 Oct 2022
V-Cloak: Intelligibility-, Naturalness- & Timbre-Preserving Real-Time Voice Anonymization
Jiangyi Deng
Fei Teng
Yanjiao Chen
Xiaofu Chen
Zhaohui Wang
Wenyuan Xu
62
11
0
27 Oct 2022