VoxCeleb2: Deep Speaker Recognition


14 June 2018
Joon Son Chung
Arsha Nagrani
Andrew Zisserman

Papers citing "VoxCeleb2: Deep Speaker Recognition"

50 / 773 papers shown
A comprehensive study on self-supervised distillation for speaker representation learning
Zhengyang Chen
Yao Qian
Bing Han
Y. Qian
Michael Zeng
SSL
39
17
0
28 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
47
13
0
28 Oct 2022
Coverage-centric Coreset Selection for High Pruning Rates
Haizhong Zheng
Rui Liu
Fan Lai
Atul Prakash
33
53
0
28 Oct 2022
Toroidal Probabilistic Spherical Discriminant Analysis
Anna Silnova
Niko Brummer
Albert Swart
L. Burget
33
2
0
27 Oct 2022
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamaki
Haizhou Li
SSL
29
8
0
27 Oct 2022
Privacy-preserving Automatic Speaker Diarization
Francisco Teixeira
A. Abad
Bhiksha Raj
Isabel Trancoso
27
4
0
26 Oct 2022
In search of strong embedding extractors for speaker diarisation
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesung Huh
A. Brown
Youngki Kwon
Shinji Watanabe
Joon Son Chung
44
16
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
C. Zhang
P. Woodland
27
1
0
24 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
44
9
0
23 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
38
22
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
31
6
0
20 Oct 2022
How to Boost Face Recognition with StyleGAN?
Artem Sevastopolsky
Yury Malkov
N. Durasov
L. Verdoliva
Matthias Nießner
PICV
28
13
0
18 Oct 2022
Risk of re-identification for shared clinical speech recordings
D. Wiepert
B. Malin
Joseph James Duffy
Rene L. Utianski
John L. Stricker
David T. Jones
Hugo Botha
40
0
0
18 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
34
0
0
17 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
40
8
0
15 Oct 2022
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Sarina Meyer
Pascal Tilli
Pavel Denisov
Florian Lux
Julia Koch
Ngoc Thang Vu
23
31
0
13 Oct 2022
Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech
Byoung Jin Choi
Myeonghun Jeong
Minchan Kim
Sung Hwan Mun
N. Kim
DiffM
27
5
0
12 Oct 2022
Controllable Radiance Fields for Dynamic Face Synthesis
Peiye Zhuang
Liqian Ma
Oluwasanmi Koyejo
A. Schwing
CVBM
3DH
18
11
0
11 Oct 2022
Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition
Yuxuan Shu
Xiao Gu
Guangyao Yang
Benny Lo
SSL
54
17
0
08 Oct 2022
PSVRF: Learning to restore Pitch-Shifted Voice without reference
Yangfu Li
Xiaodan Lin
Jiaxin Yang
19
0
0
06 Oct 2022
Geometry Driven Progressive Warping for One-Shot Face Animation
Yatao Zhong
F. Amjadi
Ilya Zharkov
3DH
CVBM
21
1
0
05 Oct 2022
Learning Video-independent Eye Contact Segmentation from In-the-Wild Videos
Tianyi Wu
Yusuke Sugano
14
0
0
05 Oct 2022
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward
Awais Khan
K. Malik
James Ryan
Mikul Saravanan
AAML
48
11
0
02 Oct 2022
Deepfake audio detection by speaker verification
Alessandro Pianese
D. Cozzolino
Giovanni Poggi
L. Verdoliva
38
39
0
28 Sep 2022
StyleSwap: Style-Based Generator Empowers Robust Face Swapping
Zhi-liang Xu
Hang Zhou
Zhibin Hong
Ziwei Liu
Jiaming Liu
Zhizhi Guo
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
CVBM
39
77
0
27 Sep 2022
StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment
Stella Bounareli
Christos Tzelepis
Vasileios Argyriou
Ioannis Patras
Georgios Tzimiropoulos
CVBM
27
17
0
27 Sep 2022
Unsupervised active speaker detection in media content using cross-modal information
Rahul Sharma
Shrikanth Narayanan
24
3
0
24 Sep 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai
Guoqiang Hong
Zhijian Ye
Ximin Li
Haizhou Li
38
7
0
23 Sep 2022
The SpeakIn System Description for CNSRC2022
Yu Zheng
Yihao Chen
Jinghan Peng
Yajun Zhang
Min Liu
Minqiang Xu
26
2
0
22 Sep 2022
Gemino: Practical and Robust Neural Compression for Video Conferencing
Vibhaalakshmi Sivaraman
Pantea Karimi
Vedantha Venkatapathy
Mehrdad Khani Shirkoohi
Sadjad Fouladi
M. Alizadeh
F. Durand
Vivienne Sze
3DH
44
17
0
21 Sep 2022
FNeVR: Neural Volume Rendering for Face Animation
Bo-Wen Zeng
Bo-Ye Liu
Hong Li
Xuhui Liu
Jianzhuang Liu
Dapeng Chen
Wei Peng
Baochang Zhang
CVBM
3DH
48
26
0
21 Sep 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
29
11
0
20 Sep 2022
SJTU-AISPEECH System for VoxCeleb Speaker Recognition Challenge 2022
Zhengyang Chen
Bing Han
Xu Xiang
Houjun Huang
Bei Liu
Y. Qian
17
8
0
19 Sep 2022
AutoLV: Automatic Lecture Video Generator
Wen Wang
Yang Song
Sanjay Jha
VGen
18
3
0
19 Sep 2022
Pay Attention to Hard Trials
Lantian Li
Di Wang
Dong Wang
48
1
0
10 Sep 2022
Learning Audio-Visual embedding for Person Verification in the Wild
Peiwen Sun
Shanshan Zhang
Zishan Liu
Yougen Yuan
Tao Zhang
Honggang Zhang
Pengfei Hu
30
4
0
09 Sep 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages
Tahir Javed
Kaushal Bhogale
A. Raman
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
ELM
30
20
0
24 Aug 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors
Sindhu B. Hegde
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
CVBM
16
1
0
17 Aug 2022
Disentangled Speaker Representation Learning via Mutual Information Minimization
Sung Hwan Mun
Mingrui Han
Minchan Kim
Dongjune Lee
N. Kim
DRL
41
9
0
17 Aug 2022
Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment
Taewoo Kim
Chaeyeon Chung
Yoonseong Kim
S. Park
Kangyeol Kim
Jaegul Choo
3DH
39
20
0
16 Aug 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech
Jaejin Cho
Jesús Villalba
Laureano Moro Velázquez
Najim Dehak
SSL
39
18
0
10 Aug 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization
Kishore Kumar A
Shefali Waldekar
Md. Sahidullah
G. Saha
24
0
0
05 Aug 2022
Video Manipulations Beyond Faces: A Dataset with Human-Machine Analysis
Trisha Mittal
Ritwik Sinha
Viswanathan Swaminathan
John Collomosse
Tianyi Zhou
30
9
0
26 Jul 2022
Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss
Riccardo Franceschini
Enrico Fini
Cigdem Beyan
Alessandro Conti
F. Arrigoni
Elisa Ricci
SSL
OffRL
34
16
0
23 Jul 2022
Telepresence Video Quality Assessment
Zhenqiang Ying
Deepti Ghadiyaram
A. Bovik
16
5
0
20 Jul 2022
Controllable Data Generation by Deep Learning: A Review
Shiyu Wang
Yuanqi Du
Xiaojie Guo
Bo Pan
Zhaohui Qin
Liang Zhao
33
28
0
19 Jul 2022
Multi-channel target speech enhancement based on ERB-scaled spatial coherence features
Yicheng Hsu
Yonghan Lee
M. Bai
25
1
0
17 Jul 2022
MegaPortraits: One-shot Megapixel Neural Head Avatars
Nikita Drobyshev
Jenya Chelishev
Taras Khakhulin
Aleksei Ivakhnenko
Victor Lempitsky
Egor Zakharov
28
108
0
15 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
27
41
0
14 Jul 2022