VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
84
129
0
12 Oct 2021
Multi-View Self-Attention Based Transformer for Speaker Recognition
Rui Wang
Junyi Ao
Long Zhou
Shujie Liu
Zhihua Wei
Tom Ko
Qing Li
Yu Zhang
ViT
71
32
0
11 Oct 2021
Self-Supervised 3D Face Reconstruction via Conditional Estimation
Yandong Wen
Weiyang Liu
Bhiksha Raj
Rita Singh
CVBM
149
22
0
10 Oct 2021
Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment
Haichao Zhang
Youcheng Ben
Weixiao Zhang
Tao Chen
Gang Yu
Bin-Bin Fu
CVBM
23
2
0
10 Oct 2021
Poformer: A simple pooling transformer for speaker verification
Yufeng Ma
Yiwei Ding
Miao Zhao
Yu Zheng
Min Liu
Minqiang Xu
ViT
58
2
0
10 Oct 2021
Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation
Peirong Liu
Rui Wang
Xuefei Cao
Yipin Zhou
Ashish Shah
Ser-Nam Lim
DiffM
73
3
0
09 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
Joel Shor
A. Jansen
Wei Han
Daniel S. Park
Yu Zhang
SSL, AI4TS
129
59
0
09 Oct 2021
Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification
Qingjian Lin
Lin Yang
Xuyang Wang
Xiaoyi Qin
Junjie Wang
Ming Li
75
21
0
09 Oct 2021
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
Ge Zhu
Frank Cwitkowitz
Z. Duan
55
2
0
08 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
You Jin Kim
Hee-Soo Heo
Jee-weon Jung
Youngki Kwon
Bong-Jin Lee
Joon Son Chung
84
3
0
07 Oct 2021
Multi-scale speaker embedding-based graph attention networks for speaker diarisation
Youngki Kwon
Hee-Soo Heo
Jee-weon Jung
You Jin Kim
Bong-Jin Lee
Joon Son Chung
96
19
0
07 Oct 2021
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
Dawei Liang
Yangyang Shi
Yun Wang
Nayan Singhal
Alex Xiao
Jonathan Shaw
Edison Thomaz
Ozlem Kalinli
M. Seltzer
50
4
0
07 Oct 2021
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR
Naoyuki Kanda
Xiong Xiao
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
86
40
0
07 Oct 2021
An Investigation of the Effectiveness of Phase for Audio Classification
Shunsuke Hidaka
Kohei Wakamiya
T. Kaburagi
28
4
0
06 Oct 2021
Voice Aging with Audio-Visual Style Transfer
Justin Wilson
Sunyeong Park
S. J. Wilson
Ming-Chia Lin
CVBM
82
0
0
05 Oct 2021
Multi-task Voice Activated Framework using Self-supervised Learning
Shehzeen Samarah Hussain
V. Nguyen
Shuhua Zhang
Erik M. Visser
SSL
113
12
0
03 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
116
108
0
30 Sep 2021
Multimodal Emotion Recognition with High-level Speech and Text Features
M. R. Makiuchi
Kuniaki Uto
Koichi Shinoda
77
72
0
29 Sep 2021
VoxCeleb Enrichment for Age and Gender Recognition
Khaled Hechmi
Trung Ngo Trong
Ville Hautamaki
Tomi Kinnunen
80
30
0
28 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
86
176
0
27 Sep 2021
Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
62
6
0
24 Sep 2021
FaceEraser: Removing Facial Parts for Augmented Reality
Miao Hua
Lijie Liu
Ziyang Cheng
Qian He
Bingchuan Li
Zili Yi
CVBM, PICV, 3DH
24
0
0
22 Sep 2021
Improving Text-Independent Speaker Verification with Auxiliary Speakers Using Graph
Jingyu Li
Si-Ioi Ng
Tan Lee
43
0
0
20 Sep 2021
FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold
Thomas Leimkuhler
G. Drettakis
CVBM, 3DH
125
11
0
20 Sep 2021
PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering
Yurui Ren
Gezhong Li
Yuanqi Chen
Thomas H. Li
Shan Liu
DiffM, VGen
120
230
0
17 Sep 2021
Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization
Prachi Singh
Sriram Ganapathy
SSL
57
7
0
14 Sep 2021
Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation
Juan Manuel Coria
H. Bredin
Sahar Ghannay
Sophie Rosset
76
30
0
14 Sep 2021
Studying squeeze-and-excitation used in CNN for speaker verification
Mickael Rouvier
Pierre-Michel Bousquet
37
10
0
13 Sep 2021
Privacy-Protecting Techniques for Behavioral Biometric Data: A Survey
Simon Hanisch
Patricia Arias-Cabarcos
Javier Parra-Arnau
Thorsten Strufe
CVBM
61
3
0
09 Sep 2021
The DKU-DukeECE System for the Self-Supervision Speaker Verification Task of the 2021 VoxCeleb Speaker Recognition Challenge
Danwei Cai
Ming Li
51
15
0
07 Sep 2021
Improving Speaker Identification for Shared Devices by Adapting Embeddings to Speaker Subsets
Zhenning Tan
Yuguang Yang
Eunjung Han
A. Stolcke
42
5
0
06 Sep 2021
Deep Person Generation: A Survey from the Perspective of Face, Pose and Cloth Synthesis
Tong Sha
Wei Zhang
T. Shen
Zhoujun Li
Tao Mei
80
39
0
05 Sep 2021
The ByteDance Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2021
Keke Wang
Xudong Mao
Hao Wu
Chen Ding
Chuxiang Shang
Rui Xia
Yuxuan Wang
60
13
0
05 Sep 2021
The SpeakIn System for VoxCeleb Speaker Recognition Challange 2021
Miao Zhao
Yufeng Ma
Min Liu
Minqiang Xu
78
59
0
05 Sep 2021
The VoicePrivacy 2020 Challenge: Results and findings
N. Tomashenko
Xin Wang
Emmanuel Vincent
J. Patino
B. M. L. Srivastava
...
Benjamin O’Brien
Anais Chanclu
J. Bonastre
Massimiliano Todisco
Mohamed Maouche
149
109
0
01 Sep 2021
Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development
Mingkuan Liu
Fangqiu Yi
Hua Xing
C. Feng
Mon-Chu Chen
Judith Bishop
Grace Ngapo
57
3
0
01 Sep 2021
Sparse to Dense Motion Transfer for Face Image Animation
Ruiqi Zhao
Tianyi Wu
Guodong Guo
3DH, CVBM
84
28
0
01 Sep 2021
ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection
Junichi Yamagishi
Xin Wang
Massimiliano Todisco
Md. Sahidullah
J. Patino
...
Xuechen Liu
Kong Aik Lee
Tomi Kinnunen
Nicholas W. D. Evans
Héctor Delgado
77
351
0
01 Sep 2021
RSKNet-MTSP: Effective and Portable Deep Architecture for Speaker Verification
Yanfeng Wu
Chenkai Guo
Junan Zhao
Xiao Jin
Jing Xu
72
14
0
30 Aug 2021
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
113
21
0
17 Aug 2021
Learning Facial Representations from the Cycle-consistency of Face
Jia-Ren Chang
Yonghao Chen
W. Chiu
CVBM
101
29
0
07 Aug 2021
Combining Attention with Flow for Person Image Synthesis
Yurui Ren
Yubo Wu
Thomas H. Li
Sha Liu
Ge Li
3DH
57
19
0
04 Aug 2021
On the Exploitability of Audio Machine Learning Pipelines to Surreptitious Adversarial Examples
Adelin Travers
Lorna Licollari
Guanghan Wang
Varun Chandrasekaran
Adam Dziedzic
David Lie
Nicolas Papernot
AAML
62
3
0
03 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing
Zhaofeng Shi
57
7
0
01 Aug 2021
Practical Attacks on Voice Spoofing Countermeasures
Andre Kassis
Urs Hengartner
AAML
49
15
0
30 Jul 2021
SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification
Wiebke Toussaint
Aaron Yi Ding
56
10
0
26 Jul 2021
Significance of Speaker Embeddings and Temporal Context for Depression Detection
Sri Harsha Dumpala
Sebastian Rodriguez
S. Rempel
Rudolf Uher
Sageev Oore
72
4
0
24 Jul 2021
Multi-modal Residual Perceptron Network for Audio-Video Emotion Recognition
Xin Chang
W. Skarbek
57
20
0
21 Jul 2021
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces
Ranya Aloufi
Hamed Haddadi
David E. Boyle
90
3
0
21 Jul 2021
A Real-time Speaker Diarization System Based on Spatial Spectrum
Siqi Zheng
Weilong Huang
Xianliang Wang
Hongbin Suo
Jinwei Feng
Zhijie Yan
69
24
0
20 Jul 2021