VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,100 papers shown
Training speaker recognition systems with limited data
Nik Vaessen
David A. van Leeuwen
24
6
0
28 Mar 2022
Thin-Plate Spline Motion Model for Image Animation
Jian Zhao
Hui Zhang
22
187
0
27 Mar 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
24
27
0
27 Mar 2022
A Speech Representation Anonymization Framework via Selective Noise Perturbation
Minh Tran
M. Soleymani
35
4
0
26 Mar 2022
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Sreyan Ghosh
Ashish Seth
Deepak Mittal
Maneesh Singh
S. Umesh
SSL
27
6
0
25 Mar 2022
3D GAN Inversion for Controllable Portrait Image Animation
Connor Z. Lin
David B. Lindell
E. R. Chan
Gordon Wetzstein
3DH
21
61
0
25 Mar 2022
The VoicePrivacy 2022 Challenge Evaluation Plan
N. Tomashenko
Xin Wang
Xiaoxiao Miao
Hubert Nourtel
Pierre Champion
Massimiliano Todisco
Emmanuel Vincent
Nicholas W. D. Evans
Junichi Yamagishi
J. Bonastre
36
62
0
23 Mar 2022
Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model
Tarun Gupta
Duc-Tuan Truong
Tran The Anh
Chng Eng Siong
29
14
0
22 Mar 2022
A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis
Rishabh Jain
Mariam Yiwere
Dan Bigioi
Peter Corcoran
H. Cucu
27
14
0
22 Mar 2022
Automated detection of foreground speech with wearable sensing in everyday home environments: A transfer learning approach
Dawei Liang
Zifan Xu
Yinuo Chen
Rebecca Adaimi
David Harwath
Edison Thomaz
48
1
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
51
6
0
21 Mar 2022
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis
Jinlong Xue
Yayue Deng
Yichen Han
Ya Li
Jianqing Sun
Jiaen Liang
4
8
0
20 Mar 2022
Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Cho-Ying Wu
Chin-Cheng Hsu
Ulrich Neumann
CVBM
16
14
0
18 Mar 2022
TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding
Ruiteng Zhang
Jianguo Wei
Xugang Lu
Wenhuan Lu
Di Jin
Junhai Xu
Lin Zhang
Y. Ji
J. Dang
20
4
0
17 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
31
87
0
16 Mar 2022
Depth-Aware Generative Adversarial Network for Talking Head Video Generation
Fa-Ting Hong
Longhao Zhang
Li Shen
Dan Xu
3DH
CVBM
39
171
0
13 Mar 2022
StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN
Fei Yin
Yong Zhang
Xiaodong Cun
Ming Cao
Yanbo Fan
Xuanxia Wang
Qingyan Bai
Baoyuan Wu
Jue Wang
Yujiu Yang
CVBM
47
171
0
08 Mar 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features
Florian Lux
Ngoc Thang Vu
33
29
0
07 Mar 2022
Voice-Face Homogeneity Tells Deepfake
Harry Cheng
Yangyang Guo
Tianyi Wang
Qi Li
Xiaojun Chang
Liqiang Nie
CVBM
41
68
0
04 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
45
106
0
02 Mar 2022
TRILLsson: Distilled Universal Paralinguistic Speech Representations
Joel Shor
Subhashini Venugopalan
35
37
0
01 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
29
7
0
28 Feb 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
71
25
0
26 Feb 2022
Improving fairness in speaker verification via Group-adapted Fusion Network
Hua Shen
Yuguang Yang
G. Sun
Ryan Langman
Eunjung Han
J. Droppo
A. Stolcke
30
15
0
23 Feb 2022
Thinking the Fusion Strategy of Multi-reference Face Reenactment
T. Yashima
T. Narihira
Tamaki Kojima
DiffM
CVBM
38
1
0
22 Feb 2022
Contrastive-mixup learning for improved speaker verification
Xin Zhang
Minho Jin
R. Cheng
Ruirui Li
Eunjung Han
A. Stolcke
AAML
SSL
25
10
0
22 Feb 2022
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation
Disong Wang
Songxiang Liu
Xixin Wu
Hui Lu
Lifa Sun
Xunying Liu
Helen Meng
28
5
0
18 Feb 2022
Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences
Vinayak Gupta
Srikanta J. Bedathur
A. De
AI4TS
27
13
0
17 Feb 2022
I'm Hearing (Different) Voices: Anonymous Voices to Protect User Privacy
H.C.M. Turner
Giulio Lovisotto
Simon Eberz
Ivan Martinovic
16
1
0
13 Feb 2022
Learnable Nonlinear Compression for Robust Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
30
2
0
10 Feb 2022
Tubes Among Us: Analog Attack on Automatic Speaker Identification
Shimaa Ahmed
Yash R. Wani
Ali Shahin Shamsabadi
Mohammad Yaghini
Ilia Shumailov
Nicolas Papernot
Kassem Fawaz
AAML
46
4
0
06 Feb 2022
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge
Naijun Zheng
Na Li
Xixin Wu
Lingwei Meng
Jiawen Kang
Haibin Wu
Chao Weng
Dan Su
Helen Meng
33
10
0
04 Feb 2022
MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances
Tianchi Liu
Rohan Kumar Das
Kong Aik Lee
Haizhou Li
27
69
0
03 Feb 2022
Speaker Normalization for Self-supervised Speech Emotion Recognition
Itai Gat
Hagai Aronowitz
Weizhong Zhu
E. Morais
R. Hoory
47
51
0
02 Feb 2022
Finding Directions in GAN's Latent Space for Neural Face Reenactment
Stella Bounareli
Vasileios Argyriou
Georgios Tzimiropoulos
3DH
CVBM
16
34
0
31 Jan 2022
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention
Artem Gorodetskii
Ivan Ozhiganov
30
2
0
25 Jan 2022
SASV Challenge 2022: A Spoofing Aware Speaker Verification Challenge Evaluation Plan
Jee-weon Jung
Hemlata Tak
Hye-jin Shim
Hee-Soo Heo
Bong-Jin Lee
Soo-Whan Chung
Hong-Goo Kang
Ha-Jin Yu
Nicholas W. D. Evans
Tomi Kinnunen
55
31
0
25 Jan 2022
Bias in Automated Speaker Recognition
Wiebke Toussaint
Aaron Yi Ding
CVBM
34
44
0
24 Jan 2022
Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training
J. Yang
Lei He
36
11
0
20 Jan 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge
A. Brown
Jaesung Huh
Joon Son Chung
Arsha Nagrani
Daniel Garcia-Romero
Andrew Zisserman
31
40
0
12 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
48
207
0
07 Jan 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
36
48
0
27 Dec 2021
Graph attentive feature aggregation for text-independent speaker verification
Hye-jin Shim
Ju-Sung Heo
Jae-han Park
Gareth Lee
Ha-Jin Yu
40
16
0
23 Dec 2021
Fusion and Orthogonal Projection for Improved Face-Voice Association
Muhammad Saeed
M. H. Khan
Shah Nawaz
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
22
28
0
20 Dec 2021
Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification
Sung Hwan Mun
Min Hyun Han
Dongjune Lee
Jihwan Kim
N. Kim
SSL
43
3
0
16 Dec 2021
End-to-end speaker diarization with transformer
Yongquan Lai
Xin Tang
Yuanyuan Fu
Rui Fang
33
1
0
14 Dec 2021
Explore Long-Range Context feature for Speaker Verification
Zhuo Li
33
6
0
14 Dec 2021
Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness
Jiseob Kim
Ji-Hyun Lee
Byoung-Tak Zhang
CVBM
19
44
0
11 Dec 2021
X-Vector based voice activity detection for multi-genre broadcast speech-to-text
Misa Ogura
Matt Haynes
24
0
0
09 Dec 2021
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
Mufan Sang
Haoqi Li
F. Liu
Andrew O. Arnold
Li Wan
SSL
16
40
0
08 Dec 2021