ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1706.08612
  4. Cited By
VoxCeleb: a large-scale speaker identification dataset

VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
ArXivPDFHTML

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,100 papers shown
Title
A Study on Bias and Fairness In Deep Speaker Recognition
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
27
2
0
14 Mar 2023
Single-branch Network for Multimodal Training
Single-branch Network for Multimodal Training
M. S. Saeed
Shah Nawaz
M. H. Khan
M. Zaheer
Karthik Nandakumar
Muhammad Haroon Yousaf
Arif Mahmood
19
12
0
10 Mar 2023
Self-supervised Facial Action Unit Detection with Region and Relation
  Learning
Self-supervised Facial Action Unit Detection with Region and Relation Learning
Juan Song
Zhilei Liu
ViT
23
1
0
10 Mar 2023
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
27
0
0
10 Mar 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual
  Fine-Grained Learning
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
39
21
0
09 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature
  Diversity and Decorrelation
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
47
3
0
07 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker
  Verification
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
42
4
0
02 Mar 2023
DISPLACE Challenge: DIarization of SPeaker and LAnguage in
  Conversational Environments
DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Sidharth Sidharth
Ranjana H
Prachi Singh
...
Pratik Roy Chowdhuri
Kaustubh Kulkarni
Swapnil Padhi
Deepu Vijayasenan
Sriram Ganapathy
45
8
0
01 Mar 2023
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Kai-Wei Chang
Yu-Kai Wang
Hua Shen
Iu-thing Kang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
VLM
31
44
0
01 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
30
1
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker
  Verification
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
31
4
0
01 Mar 2023
Speaker Recognition in Realistic Scenario Using Multimodal Data
Speaker Recognition in Realistic Scenario Using Multimodal Data
Saqlain Hussain Shah
M. S. Saeed
Shah Nawaz
Muhammad Haroon Yousaf
CVBM
26
8
0
25 Feb 2023
Towards multi-task learning of speech and speaker recognition
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
22
0
0
24 Feb 2023
Supervised Hierarchical Clustering using Graph Neural Networks for
  Speaker Diarization
Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization
Prachi Singh
Amrit Kaul
Sriram Ganapathy
BDL
38
8
0
24 Feb 2023
Amortised Invariance Learning for Contrastive Self-Supervision
Amortised Invariance Learning for Contrastive Self-Supervision
Ruchika Chavhan
Henry Gouk
Jan Stuehmer
Calum Heggan
Mehrdad Yaghoobi
Timothy M. Hospedales
SSL
42
11
0
24 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice
  Conversion
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu Zhejiang University
37
8
0
24 Feb 2023
A Framework for Unified Real-time Personalized and Non-Personalized
  Speech Enhancement
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
27
8
0
23 Feb 2023
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker
  Verification
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification
Qiongqiong Wang
Kong Aik Lee
Tianchi Liu
UQCV
30
7
0
23 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker
  Verification
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
J. Dang
23
10
0
22 Feb 2023
Interpretable Spectrum Transformation Attacks to Speaker Recognition
Interpretable Spectrum Transformation Attacks to Speaker Recognition
Jiadi Yao
H. Luo
Xiao-Lei Zhang
AAML
34
1
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
23
26
0
20 Feb 2023
Towards Measuring and Scoring Speaker Diarization Fairness
Towards Measuring and Scoring Speaker Diarization Fairness
Yannis Tevissen
Jérôme Boudy
Gérard Chollet
Frédéric Petitpont
23
2
0
20 Feb 2023
Interactive Face Video Coding: A Generative Compression Framework
Interactive Face Video Coding: A Generative Compression Framework
Bo Chen
Zhao Wang
Binzhe Li
Shurun Wang
Shiqi Wang
Yan Ye
VGen
21
16
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
Probabilistic Back-ends for Online Speaker Recognition and Clustering
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
30
1
0
19 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker
  Verification
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
33
14
0
17 Feb 2023
3D-aware Blending with Generative NeRFs
3D-aware Blending with Generative NeRFs
Hyunsung Kim
Gayoung Lee
Yunjey Choi
Jin-Hwa Kim
Jun-Yan Zhu
29
12
0
13 Feb 2023
Anti-Compression Contrastive Facial Forgery Detection
Anti-Compression Contrastive Facial Forgery Detection
Jiajun Huang
Xinqi Zhu
Chengbin Du
Siqi Ma
Surya Nepal
Chang Xu
CVBM
32
3
0
13 Feb 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with
  Unsupervised Text Pretraining
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
Takaaki Saeki
Soumi Maiti
Xinjian Li
Shinji Watanabe
Shinnosuke Takamichi
Hiroshi Saruwatari
37
18
0
30 Jan 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech
  Recognition: the Arman-AV Dataset
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
37
10
0
21 Jan 2023
The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge
  2022 System Description
The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description
Yannis Tevissen
Jérôme Boudy
Frédéric Petitpont
28
1
0
17 Jan 2023
DPE: Disentanglement of Pose and Expression for General Video Portrait
  Editing
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Youxin Pang
Yong Zhang
Weize Quan
Yanbo Fan
Xiaodong Cun
Ying Shan
Dong-ming Yan
VGen
34
34
0
16 Jan 2023
Automated speech- and text-based classification of neuropsychiatric
  conditions in a multidiagnostic setting
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
51
3
0
13 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Ziȩba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
24
35
0
10 Jan 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic
  Self-supervised Learning
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
38
5
0
19 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy,
  Safety, and Fairness
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
and Shrikanth Narayanan
48
31
0
18 Dec 2022
Context-aware Fine-tuning of Self-supervised Speech Models
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
32
7
0
16 Dec 2022
MetaPortrait: Identity-Preserving Talking Head Generation with Fast
  Personalized Adaptation
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bo Zhang
Chenyang Qi
Pan Zhang
Bo Zhang
Hsiang-Tao Wu
Dong Chen
Qifeng Chen
Yong Wang
Fang Wen
31
54
0
15 Dec 2022
PV3D: A 3D Generative Model for Portrait Video Generation
PV3D: A 3D Generative Model for Portrait Video Generation
Eric Xu
Jianfeng Zhang
Jun Hao Liew
Wenqing Zhang
Song Bai
Jiashi Feng
Mike Zheng Shou
VGen
39
20
0
13 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
28
35
0
10 Dec 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in
  Transformers
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Koike Hideki
35
34
0
09 Dec 2022
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion
  Priors
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
30
36
0
07 Dec 2022
DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML
  Workloads
DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads
Seah Kim
Hyoukjun Kwon
Jinook Song
Jihyuck Jo
Yu-Hsin Chen
Liangzhen Lai
Vikas Chandra
AI4TS
34
7
0
07 Dec 2022
Label-free Knowledge Distillation with Contrastive Loss for Light-weight
  Speaker Recognition
Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition
Zhiyuan Peng
Xuanji He
Ke Ding
Tan Lee
Guanglu Wan
17
6
0
06 Dec 2022
Covariance Regularization for Probabilistic Linear Discriminant Analysis
Covariance Regularization for Probabilistic Linear Discriminant Analysis
Zhiyuan Peng
Mingjie Shao
Xuanji He
Xu Li
Tan Lee
Ke Ding
Guanglu Wan
28
1
0
06 Dec 2022
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video
  Editing via Disentangled Video Encoding
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim
Hajin Shim
Hyunsung Kim
Yunjey Choi
Junho Kim
Eunho Yang
DiffM
VGen
39
31
0
06 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing
  Tasks
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Topological Data Analysis for Speech Processing
Topological Data Analysis for Speech Processing
Eduard Tulchinskii
Kristian Kuznetsov
Laida Kushnareva
D. Cherniavskii
S. Barannikov
Irina Piontkovskaya
Sergey I. Nikolenko
Evgeny Burnaev
43
5
0
30 Nov 2022
MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource
  Indian Languages
MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages
Yue Li
Li Zhang
Na Wang
Jie Liu
Linfu Xie
41
0
0
30 Nov 2022
Evaluating and reducing the distance between synthetic and real speech
  distributions
Evaluating and reducing the distance between synthetic and real speech distributions
Christoph Minixhofer
Ondˇrej Klejch
P. Bell
36
8
0
29 Nov 2022
Previous
123...789...202122
Next