VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
Privacy-preserving Automatic Speaker Diarization
Francisco Teixeira
A. Abad
Bhiksha Raj
Isabel Trancoso
75
4
0
26 Oct 2022
In search of strong embedding extractors for speaker diarisation
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesung Huh
A. Brown
Youngki Kwon
Shinji Watanabe
Joon Son Chung
83
16
0
26 Oct 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Zhang
Qing Wang
Linfu Xie
BDL
52
2
0
26 Oct 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
105
33
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
Chuxu Zhang
P. Woodland
51
1
0
24 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
87
9
0
23 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
107
23
0
21 Oct 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis
Florian Lux
Ching-Yi Chen
Ngoc Thang Vu
39
1
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
65
6
0
20 Oct 2022
Risk of re-identification for shared clinical speech recordings
D. Wiepert
B. Malin
Joseph James Duffy
Rene L. Utianski
John L. Stricker
David T. Jones
Hugo Botha
58
0
0
18 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
82
2
0
17 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
60
10
0
15 Oct 2022
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks
Run Wang
Jixing Ren
Boheng Li
Tianyi She
Wenhui Zhang
Liming Fang
Jing Chen
Chao Shen
Lina Wang
WIGM
79
19
0
14 Oct 2022
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Sarina Meyer
Pascal Tilli
Pavel Denisov
Florian Lux
Julia Koch
Ngoc Thang Vu
85
32
0
13 Oct 2022
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar
Aolan Sun
Xulong Zhang
Tiandong Ling
Jianzong Wang
Ning Cheng
Jing Xiao
52
4
0
13 Oct 2022
Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition
Yuxuan Shu
Xiao Gu
Guangyao Yang
Benny Lo
SSL
107
18
0
08 Oct 2022
Compressing Video Calls using Synthetic Talking Heads
Madhav Agarwal
Anchit Gupta
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
49
11
0
07 Oct 2022
A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis
Yichen Han
Ya Li
Yingming Gao
Jinlong Xue
Songpo Wang
Lei Yang
34
2
0
07 Oct 2022
Audio-Visual Face Reenactment
Madhav Agarwal
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
DiffM
VGen
61
24
0
06 Oct 2022
PSVRF: Learning to restore Pitch-Shifted Voice without reference
Yangfu Li
Xiaodan Lin
Jiaxin Yang
55
0
0
06 Oct 2022
Geometry Driven Progressive Warping for One-Shot Face Animation
Yatao Zhong
F. Amjadi
Ilya Zharkov
3DH
CVBM
113
1
0
05 Oct 2022
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward
Awais Khan
K. Malik
James Ryan
Mikul Saravanan
AAML
118
15
0
02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
62
1
0
30 Sep 2022
Motion and Appearance Adaptation for Cross-Domain Motion Transfer
Borun Xu
Biao Wang
Jinhong Deng
Jiale Tao
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
117
9
0
29 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
85
3
0
28 Sep 2022
Motion Transformer for Unsupervised Image Animation
Jiale Tao
Biao Wang
T. Ge
Yuning Jiang
Wen Li
Lixin Duan
ViT
90
11
0
28 Sep 2022
StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment
Stella Bounareli
Christos Tzelepis
Vasileios Argyriou
Ioannis Patras
Georgios Tzimiropoulos
CVBM
88
18
0
27 Sep 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
80
17
0
24 Sep 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
105
11
0
23 Sep 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai
Guoqiang Hong
Zhijian Ye
Ximin Li
Haizhou Li
119
7
0
23 Sep 2022
Gemino: Practical and Robust Neural Compression for Video Conferencing
Vibhaalakshmi Sivaraman
Pantea Karimi
Vedantha Venkatapathy
Mehrdad Khani Shirkoohi
Sadjad Fouladi
M. Alizadeh
F. Durand
Vivienne Sze
3DH
115
19
0
21 Sep 2022
FNeVR: Neural Volume Rendering for Face Animation
Bo-Wen Zeng
Bo-Ye Liu
Hong Li
Xuhui Liu
Jianzhuang Liu
Dapeng Chen
Wei Peng
Baochang Zhang
CVBM
3DH
121
28
0
21 Sep 2022
Pay Attention to Hard Trials
Lantian Li
Di Wang
Dong Wang
113
1
0
10 Sep 2022
Defend Data Poisoning Attacks on Voice Authentication
Ke Li
Cameron Baird
D. Lin
AAML
75
9
0
09 Sep 2022
Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances
Chang Zeng
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
69
6
0
01 Sep 2022
Computing with Hypervectors for Efficient Speaker Identification
Ping-Chen Huang
Denis Kleyko
J. Rabaey
Bruno A. Olshausen
P. Kanerva
81
2
0
28 Aug 2022
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Takuya Yoshioka
Jian Wu
87
29
0
27 Aug 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages
Tahir Javed
Kaushal Bhogale
A. Raman
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
ELM
90
26
0
24 Aug 2022
Learning Branched Fusion and Orthogonal Projection for Face-Voice Association
M. S. Saeed
Shah Nawaz
M. H. Khan
S. Javed
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
74
4
0
22 Aug 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
135
55
0
20 Aug 2022
Disentangled Speaker Representation Learning via Mutual Information Minimization
Sung Hwan Mun
Mingrui Han
Minchan Kim
Dongjune Lee
N. Kim
DRL
97
11
0
17 Aug 2022
Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment
Taewoo Kim
Chaeyeon Chung
Yoonseong Kim
S. Park
Kangyeol Kim
Jaegul Choo
3DH
69
21
0
16 Aug 2022
FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing
Jingbo Zhang
Xiaoyu Li
Bo Liu
Can Wang
Jing Liao
3DH
CVBM
138
42
0
11 Aug 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech
Jaejin Cho
Jesús Villalba
Laureano Moro-Velazquez
Najim Dehak
SSL
90
18
0
10 Aug 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization
Kishore Kumar A
Shefali Waldekar
Md. Sahidullah
G. Saha
52
0
0
05 Aug 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition
Wei Xia
John H. L. Hansen
65
4
0
04 Aug 2022
Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control
M. Doukas
Evangelos Ververas
V. Sharmanska
Stefanos Zafeiriou
CVBM
70
15
0
03 Aug 2022
The SJTU System for Short-duration Speaker Verification Challenge 2021
Bing Han
Zhengyang Chen
Zhikai Zhou
Y. Qian
19
7
0
03 Aug 2022
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction
Bing Han
Zhengyang Chen
Y. Qian
61
32
0
03 Aug 2022
End-To-End Audiovisual Feature Fusion for Active Speaker Detection
Fiseha B. Tesema
Zheyuan Lin
Shiqiang Zhu
Wei Song
J. Gu
Hong-Chuan Wu
42
4
0
27 Jul 2022