ResearchTrend.AI
VoxCeleb: a large-scale speaker identification dataset


26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,100 papers shown
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Naoyuki Kanda
Xiong Xiao
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
29
36
0
07 Oct 2021
An Investigation of the Effectiveness of Phase for Audio Classification
Shunsuke Hidaka
Kohei Wakamiya
T. Kaburagi
23
4
0
06 Oct 2021
Voice Aging with Audio-Visual Style Transfer
Justin Wilson
Sunyeong Park
S. J. Wilson
Ming-Chia Lin
CVBM
13
0
0
05 Oct 2021
Multi-task Voice Activated Framework using Self-supervised Learning
Shehzeen Samarah Hussain
V. Nguyen
Shuhua Zhang
Erik M. Visser
SSL
27
12
0
03 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
47
107
0
30 Sep 2021
Multimodal Emotion Recognition with High-level Speech and Text Features
M. R. Makiuchi
Kuniaki Uto
Koichi Shinoda
12
70
0
29 Sep 2021
VoxCeleb Enrichment for Age and Gender Recognition
Khaled Hechmi
Trung Ngo Trong
Ville Hautamaki
Tomi Kinnunen
24
30
0
28 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
34
175
0
27 Sep 2021
Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
38
6
0
24 Sep 2021
FaceEraser: Removing Facial Parts for Augmented Reality
Miao Hua
Lijie Liu
Ziyang Cheng
Qian He
Bingchuan Li
Zili Yi
CVBM
PICV
3DH
20
0
0
22 Sep 2021
Improving Text-Independent Speaker Verification with Auxiliary Speakers Using Graph
Jingyu Li
Si-Ioi Ng
Tan Lee
19
0
0
20 Sep 2021
FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold
Thomas Leimkuhler
G. Drettakis
CVBM
3DH
92
11
0
20 Sep 2021
PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering
Yurui Ren
Gezhong Li
Yuanqi Chen
Thomas H. Li
Shan Liu
DiffM
VGen
49
225
0
17 Sep 2021
Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization
Prachi Singh
Sriram Ganapathy
SSL
31
7
0
14 Sep 2021
Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation
Juan Manuel Coria
H. Bredin
Sahar Ghannay
Sophie Rosset
54
30
0
14 Sep 2021
Studying squeeze-and-excitation used in CNN for speaker verification
Mickael Rouvier
Pierre-Michel Bousquet
16
9
0
13 Sep 2021
Privacy-Protecting Techniques for Behavioral Biometric Data: A Survey
Simon Hanisch
Patricia Arias-Cabarcos
Javier Parra-Arnau
Thorsten Strufe
CVBM
33
2
0
09 Sep 2021
The DKU-DukeECE System for the Self-Supervision Speaker Verification Task of the 2021 VoxCeleb Speaker Recognition Challenge
Danwei Cai
Ming Li
11
15
0
07 Sep 2021
Improving Speaker Identification for Shared Devices by Adapting Embeddings to Speaker Subsets
Zhenning Tan
Yuguang Yang
Eunjung Han
A. Stolcke
18
5
0
06 Sep 2021
Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis
Tong Sha
Wei Zhang
T. Shen
Zhoujun Li
Tao Mei
40
38
0
05 Sep 2021
The ByteDance Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2021
Keke Wang
Xudong Mao
Hao Wu
Chen Ding
Chuxiang Shang
Rui Xia
Yuxuan Wang
28
13
0
05 Sep 2021
The SpeakIn System for VoxCeleb Speaker Recognition Challange 2021
Miao Zhao
Yufeng Ma
Min Liu
Minqiang Xu
33
59
0
05 Sep 2021
The VoicePrivacy 2020 Challenge: Results and findings
N. Tomashenko
Xin Wang
Emmanuel Vincent
J. Patino
B. M. L. Srivastava
...
Benjamin O’Brien
Anais Chanclu
J. Bonastre
Massimiliano Todisco
Mohamed Maouche
46
106
0
01 Sep 2021
Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development
Mingkuan Liu
Chi Zhang
Hua Xing
C. Feng
Mon-Chu Chen
Judith Bishop
Grace Ngapo
30
3
0
01 Sep 2021
Sparse to Dense Motion Transfer for Face Image Animation
Ruiqi Zhao
Tianyi Wu
Guodong Guo
3DH
CVBM
42
27
0
01 Sep 2021
ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection
Junichi Yamagishi
Xin Wang
Massimiliano Todisco
Md. Sahidullah
J. Patino
...
Xuechen Liu
Kong Aik Lee
Tomi Kinnunen
Nicholas W. D. Evans
Héctor Delgado
33
336
0
01 Sep 2021
RSKNet-MTSP: Effective and Portable Deep Architecture for Speaker Verification
Yanfeng Wu
Chenkai Guo
Junan Zhao
Xiao Jin
Jing Xu
24
13
0
30 Aug 2021
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
52
20
0
17 Aug 2021
Learning Facial Representations from the Cycle-consistency of Face
Jia-Ren Chang
Yonghao Chen
W. Chiu
CVBM
30
29
0
07 Aug 2021
Combining Attention with Flow for Person Image Synthesis
Yurui Ren
Yubo Wu
Thomas H. Li
Sha Liu
Ge Li
3DH
30
19
0
04 Aug 2021
On the Exploitability of Audio Machine Learning Pipelines to Surreptitious Adversarial Examples
Adelin Travers
Lorna Licollari
Guanghan Wang
Varun Chandrasekaran
Adam Dziedzic
David Lie
Nicolas Papernot
AAML
35
3
0
03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
26
7
0
01 Aug 2021
Practical Attacks on Voice Spoofing Countermeasures
Andre Kassis
Urs Hengartner
AAML
17
14
0
30 Jul 2021
SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification
Wiebke Toussaint
Aaron Yi Ding
27
10
0
26 Jul 2021
Significance of Speaker Embeddings and Temporal Context for Depression Detection
Sri Harsha Dumpala
Sebastian Rodriguez
S. Rempel
Rudolf Uher
Sageev Oore
28
4
0
24 Jul 2021
Multi-modal Residual Perceptron Network for Audio-Video Emotion Recognition
Xin Chang
W. Skarbek
30
19
0
21 Jul 2021
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces
Ranya Aloufi
Hamed Haddadi
David E. Boyle
42
2
0
21 Jul 2021
A Real-time Speaker Diarization System Based on Spatial Spectrum
Siqi Zheng
Weilong Huang
Xianliang Wang
Hongbin Suo
Jinwei Feng
Zhijie Yan
19
24
0
20 Jul 2021
Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion
Suzhe Wang
Lincheng Li
Yu-qiong Ding
Changjie Fan
Xin Yu
VGen
41
161
0
20 Jul 2021
Controlled AutoEncoders to Generate Faces from Voices
Hao Liang
Lulan Yu
Gu Xu
Bhiksha Raj
Rita Singh
CVBM
15
4
0
16 Jul 2021
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding
Hongning Zhu
Kong Aik Lee
Haizhou Li
38
15
0
14 Jul 2021
MACCIF-TDNN: Multi aspect aggregation of channel and context interdependence features in TDNN-based speaker verification
Fangyuan Wang
Z. Song
Hongchen Jiang
Bo Xu
43
8
0
07 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
21
14
0
06 Jul 2021
Multi-modality Deep Restoration of Extremely Compressed Face Videos
Xi Zhang
Xiaolin Wu
CVBM
22
13
0
05 Jul 2021
The HCCL Speaker Verification System for Far-Field Speaker Verification Challenge
Zhuo Li
Ce Fang
Runqiu Xiao
Zhigao Chen
Wenchao Wang
Yonghong Yan
25
2
0
03 Jul 2021
Pretext Tasks selection for multitask self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
Abdel Heba
SSL
24
12
0
01 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
49
12
0
01 Jul 2021
Adversarial Sample Detection for Speaker Verification by Neural Vocoders
Haibin Wu
Po-Chun Hsu
Ji Gao
Shanshan Zhang
Shen Huang
Jian Kang
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
38
20
0
01 Jul 2021
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus
Hamdy Mubarak
A. Hussein
Shammur A. Chowdhury
Ahmed M. Ali
24
44
0
24 Jun 2021
Graph-based Label Propagation for Semi-Supervised Speaker Identification
Long Chen
Venkatesh Ravichandran
A. Stolcke
SSL
27
16
0
15 Jun 2021