1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
Privacy-preserving Automatic Speaker Diarization
Francisco Teixeira
A. Abad
Bhiksha Raj
Isabel Trancoso
75
4
0
26 Oct 2022
In search of strong embedding extractors for speaker diarisation
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesung Huh
A. Brown
Youngki Kwon
Shinji Watanabe
Joon Son Chung
83
16
0
26 Oct 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Zhang
Qing Wang
Linfu Xie
BDL
52
2
0
26 Oct 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
105
33
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
Chuxu Zhang
P. Woodland
51
1
0
24 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
87
9
0
23 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
107
23
0
21 Oct 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis
Florian Lux
Ching-Yi Chen
Ngoc Thang Vu
39
1
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
65
6
0
20 Oct 2022
Risk of re-identification for shared clinical speech recordings
D. Wiepert
B. Malin
Joseph James Duffy
Rene L. Utianski
John L. Stricker
David T. Jones
Hugo Botha
58
0
0
18 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
82
2
0
17 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
60
10
0
15 Oct 2022
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks
Run Wang
Jixing Ren
Boheng Li
Tianyi She
Wenhui Zhang
Liming Fang
Jing Chen
Chao Shen
Lina Wang
WIGM
79
19
0
14 Oct 2022
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Sarina Meyer
Pascal Tilli
Pavel Denisov
Florian Lux
Julia Koch
Ngoc Thang Vu
85
32
0
13 Oct 2022
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar
Aolan Sun
Xulong Zhang
Tiandong Ling
Jianzong Wang
Ning Cheng
Jing Xiao
52
4
0
13 Oct 2022
Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition
Yuxuan Shu
Xiao Gu
Guangyao Yang
Benny Lo
SSL
107
18
0
08 Oct 2022
Compressing Video Calls using Synthetic Talking Heads
Madhav Agarwal
Anchit Gupta
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
49
11
0
07 Oct 2022
A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis
Yichen Han
Ya Li
Yingming Gao
Jinlong Xue
Songpo Wang
Lei Yang
34
2
0
07 Oct 2022
Audio-Visual Face Reenactment
Madhav Agarwal
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
DiffM
VGen
61
24
0
06 Oct 2022
PSVRF: Learning to restore Pitch-Shifted Voice without reference
Yangfu Li
Xiaodan Lin
Jiaxin Yang
55
0
0
06 Oct 2022
Geometry Driven Progressive Warping for One-Shot Face Animation
Yatao Zhong
F. Amjadi
Ilya Zharkov
3DH
CVBM
113
1
0
05 Oct 2022
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward
Awais Khan
K. Malik
James Ryan
Mikul Saravanan
AAML
118
15
0
02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
62
1
0
30 Sep 2022
Motion and Appearance Adaptation for Cross-Domain Motion Transfer
Borun Xu
Biao Wang
Jinhong Deng
Jiale Tao
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
117
9
0
29 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
85
3
0
28 Sep 2022
Motion Transformer for Unsupervised Image Animation
Jiale Tao
Biao Wang
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
ViT
90
11
0
28 Sep 2022
StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment
Stella Bounareli
Christos Tzelepis
Vasileios Argyriou
Ioannis Patras
Georgios Tzimiropoulos
CVBM
88
18
0
27 Sep 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
80
17
0
24 Sep 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
105
11
0
23 Sep 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai
Guoqiang Hong
Zhijian Ye
Ximin Li
Haizhou Li
119
7
0
23 Sep 2022
Gemino: Practical and Robust Neural Compression for Video Conferencing
Vibhaalakshmi Sivaraman
Pantea Karimi
Vedantha Venkatapathy
Mehrdad Khani Shirkoohi
Sadjad Fouladi
M. Alizadeh
F. Durand
Vivienne Sze
3DH
115
19
0
21 Sep 2022
FNeVR: Neural Volume Rendering for Face Animation
Bo-Wen Zeng
Bo-Ye Liu
Hong Li
Xuhui Liu
Jianzhuang Liu
Dapeng Chen
Wei Peng
Baochang Zhang
CVBM
3DH
121
28
0
21 Sep 2022
Pay Attention to Hard Trials
Lantian Li
Di Wang
Dong Wang
113
1
0
10 Sep 2022
Defend Data Poisoning Attacks on Voice Authentication
Ke Li
Cameron Baird
D. Lin
AAML
75
9
0
09 Sep 2022
Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances
Chang Zeng
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
69
6
0
01 Sep 2022
Computing with Hypervectors for Efficient Speaker Identification
Ping-Chen Huang
Denis Kleyko
J. Rabaey
Bruno A. Olshausen
P. Kanerva
81
2
0
28 Aug 2022
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Takuya Yoshioka
Jian Wu
87
29
0
27 Aug 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages
Tahir Javed
Kaushal Bhogale
A. Raman
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
ELM
90
26
0
24 Aug 2022
Learning Branched Fusion and Orthogonal Projection for Face-Voice Association
M. S. Saeed
Shah Nawaz
M. H. Khan
S. Javed
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
74
4
0
22 Aug 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
135
55
0
20 Aug 2022
Disentangled Speaker Representation Learning via Mutual Information Minimization
Sung Hwan Mun
Mingrui Han
Minchan Kim
Dongjune Lee
N. Kim
DRL
97
11
0
17 Aug 2022
Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment
Taewoo Kim
Chaeyeon Chung
Yoonseong Kim
S. Park
Kangyeol Kim
Jaegul Choo
3DH
69
21
0
16 Aug 2022
FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing
Jingbo Zhang
Xiaoyu Li
Bo Liu
Can Wang
Jing Liao
3DH
CVBM
138
42
0
11 Aug 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech
Jaejin Cho
Jesús Villalba
Laureano Moro-Velazquez
Najim Dehak
SSL
90
18
0
10 Aug 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization
Kishore Kumar A
Shefali Waldekar
Md. Sahidullah
G. Saha
52
0
0
05 Aug 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition
Wei Xia
John H. L. Hansen
65
4
0
04 Aug 2022
Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control
M. Doukas
Evangelos Ververas
V. Sharmanska
Stefanos Zafeiriou
CVBM
70
15
0
03 Aug 2022
The SJTU System for Short-duration Speaker Verification Challenge 2021
Bing Han
Zhengyang Chen
Zhikai Zhou
Y. Qian
19
7
0
03 Aug 2022
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction
Bing Han
Zhengyang Chen
Y. Qian
61
32
0
03 Aug 2022
End-To-End Audiovisual Feature Fusion for Active Speaker Detection
Fiseha B. Tesema
Zheyuan Lin
Shiqiang Zhu
Wei Song
J. Gu
Hong-Chuan Wu
42
4
0
27 Jul 2022