VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,100 papers shown
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
K. Cheng
Xiaodong Cun
Yong Zhang
Menghan Xia
Fei Yin
Mingrui Zhu
Xuanxia Wang
Jue Wang
Nan Wang
CVBM
30
92
0
27 Nov 2022
Learning Detailed Radiance Manifolds for High-Fidelity and 3D-Consistent Portrait Synthesis from Monocular Image
Yu Deng
Baoyuan Wang
H. Shum
3DH
27
10
0
25 Nov 2022
Pose-disentangled Contrastive Learning for Self-supervised Facial Representation
Y. Liu
Wenbin Wang
Yibing Zhan
Shaoze Feng
Li-Yu Daisy Liu
Zhe Chen
SSL
29
13
0
24 Nov 2022
A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition
Jiacheng Zhang
Wenyi Yan
Ye Zhang
31
2
0
24 Nov 2022
Semantic-aware One-shot Face Re-enactment with Dense Correspondence Estimation
Yunfan Liu
Qi Li
Zhen Sun
Tieniu Tan
CVBM
11
0
0
23 Nov 2022
Complex-Valued Time-Frequency Self-Attention for Speech Dereverberation
Vinay Kothapally
John H. L. Hansen
37
9
0
22 Nov 2022
SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Wenxuan Zhang
Xiaodong Cun
Xuan Wang
Yong Zhang
Xiaodong Shen
Yu-Xiao Guo
Ying Shan
Fei Wang
VGen
50
235
0
22 Nov 2022
Robust Training for Speaker Verification against Noisy Labels
Zhihua Fang
Liang He
Hanhan Ma
Xiao-Min Guo
Lin Li
NoLa
35
3
0
22 Nov 2022
Speaker Adaptation for End-To-End Speech Recognition Systems in Noisy Environments
Dominik Wagner
Ilja Baumann
Sebastian P. Bayerl
Korbinian Riedhammer
Tobias Bocklet
47
2
0
16 Nov 2022
Towards an objective characterization of an individual's facial movements using Self-Supervised Person-Specific-Models
Yanis Tazi
M. Berger
W. Freiwald
14
0
0
15 Nov 2022
Multi-Label Training for Text-Independent Speaker Identification
Yuqi Xue
27
0
0
14 Nov 2022
Towards A Unified Conformer Structure: from ASR to ASV Task
Dexin Liao
Tao Jiang
Feng Wang
Lin Li
Q. Hong
30
10
0
14 Nov 2022
Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities
Yikang Wang
Xingming Wang
Hiromitsu Nishizaki
Ming Li
27
6
0
12 Nov 2022
Speech separation with large-scale self-supervised learning
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu-Huan Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
S. Sivasankaran
Sefik Emre Eskimez
21
14
0
09 Nov 2022
Pushing the limits of self-supervised speaker verification using regularized distillation framework
Yafeng Chen
Siqi Zheng
Haibo Wang
Luyao Cheng
Qian Chen
25
25
0
08 Nov 2022
High-resolution embedding extractor for speaker diarisation
Hee-Soo Heo
Youngki Kwon
Bong-Jin Lee
You Jin Kim
Jee-weon Jung
32
5
0
08 Nov 2022
SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers
Alessandro Arezzo
Stefano Berretti
ViT
35
15
0
04 Nov 2022
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
26
5
0
04 Nov 2022
Dynamic Kernels and Channel Attention for Low Resource Speaker Verification
A. Ollerenshaw
Md. Asif Jalal
Thomas Hain
19
0
0
03 Nov 2022
SLICER: Learning universal audio representations using low-resource self-supervised pre-training
Ashish Seth
Sreyan Ghosh
S. Umesh
Tianyi Zhou
SSL
36
3
0
02 Nov 2022
MAST: Multiscale Audio Spectrogram Transformers
Sreyan Ghosh
Ashish Seth
S. Umesh
Tianyi Zhou
22
3
0
02 Nov 2022
LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification
Xingqi Chen
Jie Wang
Xiaoli Zhang
Weiqiang Zhang
Kunde Yang
AAML
28
7
0
02 Nov 2022
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022
Zhengyang Chen
Bing Han
Xu Xiang
Houjun Huang
Bei Liu
Y. Qian
32
13
0
02 Nov 2022
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings
Mohan Shi
Jie Zhang
Zhihao Du
Fan Yu
Qian Chen
Shiliang Zhang
Lirong Dai
51
4
0
01 Nov 2022
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings
Zili Huang
Desh Raj
Leibny Paola García-Perera
Sanjeev Khudanpur
101
24
0
01 Nov 2022
Disentangled representation learning for multilingual speaker recognition
KiHyun Nam
You-kyong. Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
53
7
0
01 Nov 2022
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Jingyu Li
W. Liu
Zhaoyang Zhang
Jiong Wang
Tan Lee
MQ
24
3
0
31 Oct 2022
Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification
Jingyu Li
Yusheng Tian
Tan Lee
36
9
0
31 Oct 2022
Combining Automatic Speaker Verification and Prosody Analysis for Synthetic Speech Detection
L. Attorresi
Davide Salvi
Clara Borrelli
Paolo Bestagini
Stefano Tubaro
26
22
0
31 Oct 2022
Application of Knowledge Distillation to Multi-task Speech Representation Learning
Mine Kerpicci
V. Nguyen
Shuhua Zhang
Erik M. Visser
35
0
0
29 Oct 2022
Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov
V. Volokhov
G. Lavrentyeva
12
2
0
28 Oct 2022
Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction
Ming Cheng
Weiqing Wang
Yucong Zhang
Xiaoyi Qin
Ming Li
VLM
56
33
0
28 Oct 2022
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters
Junyi Peng
Themos Stafylakis
Rongzhi Gu
Oldřich Plchot
Ladislav Mošner
Lukáš Burget
Jan "Honza" Černocký
47
22
0
28 Oct 2022
Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments
Yuke Lin
Xiaoyi Qin
Huahua Cui
Zhenyi Zhu
Ming Li
24
1
0
28 Oct 2022
A comprehensive study on self-supervised distillation for speaker representation learning
Zhengyang Chen
Yao Qian
Bing Han
Y. Qian
Michael Zeng
SSL
46
17
0
28 Oct 2022
Speaker recognition with two-step multi-modal deep cleansing
Ruijie Tao
Kong Aik Lee
Zhan Shi
Haizhou Li
NoLa
52
13
0
28 Oct 2022
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs
Ruijie Tao
Kong Aik Lee
Rohan Kumar Das
Ville Hautamaki
Haizhou Li
SSL
34
10
0
27 Oct 2022
V-Cloak: Intelligibility-, Naturalness- & Timbre-Preserving Real-Time Voice Anonymization
Jiangyi Deng
Fei Teng
Yanjiao Chen
Xiaofu Chen
Zhaohui Wang
Wenyuan Xu
13
11
0
27 Oct 2022
Privacy-preserving Automatic Speaker Diarization
Francisco Teixeira
A. Abad
Bhiksha Raj
Isabel Trancoso
30
4
0
26 Oct 2022
In search of strong embedding extractors for speaker diarisation
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesung Huh
A. Brown
Youngki Kwon
Shinji Watanabe
Joon Son Chung
44
16
0
26 Oct 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Zhang
Qing Wang
Linfu Xie
BDL
28
2
0
26 Oct 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
40
31
0
26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation
Evonne Lee
Guangzhi Sun
Chuxu Zhang
P. Woodland
27
1
0
24 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation
Xiaoyu Liu
Xu Li
Joan Serrà
44
9
0
23 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS
Florian Lux
Julia Koch
Ngoc Thang Vu
40
22
0
21 Oct 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis
Florian Lux
Ching-Yi Chen
Ngoc Thang Vu
23
1
0
21 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
36
6
0
20 Oct 2022
Risk of re-identification for shared clinical speech recordings
D. Wiepert
B. Malin
Joseph James Duffy
Rene L. Utianski
John L. Stricker
David T. Jones
Hugo Botha
48
0
0
18 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
45
0
0
17 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations
Themos Stafylakis
Ladislav Mošner
Sofoklis Kakouros
Oldrich Plchot
L. Burget
J. Černocký
SSL
40
8
0
15 Oct 2022