VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
Face Animation with an Attribute-Guided Diffusion Model
Bo-Wen Zeng
Xuhui Liu
Sicheng Gao
Boyu Liu
Hong Li
Jianzhuang Liu
Baochang Zhang
86
33
0
06 Apr 2023
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
Sasikarn Khwanmuang
Pakkapon Phongthawee
Patsorn Sangkloy
Supasorn Suwajanakorn
3DH
67
8
0
05 Apr 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
77
35
0
29 Mar 2023
VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs
Anna Frühstück
N. Sarafianos
Yuanlu Xu
Peter Wonka
Tony Tung
85
20
0
28 Mar 2023
RobustSwap: A Simple yet Robust Face Swapping Model against Attribute Leakage
Jaeseong Lee
Taewoo Kim
S. Park
Younggun Lee
Jaegul Choo
CVBM
105
2
0
28 Mar 2023
A Universal Identity Backdoor Attack against Speaker Verification based on Siamese Network
Haodong Zhao
Wei Du
Junjie Guo
Gongshen Liu
AAML
87
0
0
28 Mar 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu
Hao Zhu
Liming Jiang
Chen Change Loy
Weidong (Tom) Cai
Wayne Wu
77
62
0
26 Mar 2023
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
Yangfu Li
Jiapan Gan
Xiaodan Lin
53
6
0
20 Mar 2023
Right the docs: Characterising voice dataset documentation practices used in machine learning
Kathy Reid
Elizabeth T. Williams
66
2
0
19 Mar 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
65
1
0
19 Mar 2023
MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation
Haozhe Wu
Jia Jia
Junliang Xing
Hongwei Xu
Xiangyuan Wang
Jelo Wang
CVBM
83
7
0
17 Mar 2023
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
51
0
0
15 Mar 2023
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
59
2
0
14 Mar 2023
Single-branch Network for Multimodal Training
M. S. Saeed
Shah Nawaz
M. H. Khan
M. Zaheer
Karthik Nandakumar
Muhammad Haroon Yousaf
Arif Mahmood
42
13
0
10 Mar 2023
Self-supervised Facial Action Unit Detection with Region and Relation Learning
Juan Song
Zhilei Liu
ViT
40
1
0
10 Mar 2023
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
58
0
0
10 Mar 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
80
24
0
09 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
51
3
0
09 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
99
3
0
07 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
130
4
0
02 Mar 2023
DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Sidharth Sidharth
Ranjana H
Prachi Singh
...
Pratik Roy Chowdhuri
Kaustubh Kulkarni
Swapnil Padhi
Deepu Vijayasenan
Sriram Ganapathy
80
9
0
01 Mar 2023
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Kai-Wei Chang
Yu-Kai Wang
Hua Shen
Iu-thing Kang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
VLM
88
46
0
01 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
70
1
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
58
4
0
01 Mar 2023
Speaker Recognition in Realistic Scenario Using Multimodal Data
Saqlain Hussain Shah
M. S. Saeed
Shah Nawaz
Muhammad Haroon Yousaf
CVBM
79
9
0
25 Feb 2023
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
24
0
0
24 Feb 2023
Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization
Prachi Singh
Amrit Kaul
Sriram Ganapathy
BDL
64
8
0
24 Feb 2023
Amortised Invariance Learning for Contrastive Self-Supervision
Ruchika Chavhan
Henry Gouk
Jan Stuehmer
Calum Heggan
Mehrdad Yaghoobi
Timothy M. Hospedales
SSL
88
11
0
24 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
99
8
0
24 Feb 2023
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
72
9
0
23 Feb 2023
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification
Qiongqiong Wang
Kong Aik Lee
Tianchi Liu
UQCV
73
7
0
23 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
Jianwu Dang
83
10
0
22 Feb 2023
Interpretable Spectrum Transformation Attacks to Speaker Recognition
Jiadi Yao
H. Luo
Xiao-Lei Zhang
AAML
61
2
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
106
26
0
20 Feb 2023
Towards Measuring and Scoring Speaker Diarization Fairness
Yannis Tevissen
Jérôme Boudy
Gérard Chollet
Frédéric Petitpont
99
2
0
20 Feb 2023
Interactive Face Video Coding: A Generative Compression Framework
Bo Chen
Zhao Wang
Binzhe Li
Shurun Wang
Shiqi Wang
Yan Ye
VGen
71
19
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
63
1
0
19 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
81
15
0
17 Feb 2023
3D-aware Blending with Generative NeRFs
Hyunsung Kim
Gayoung Lee
Yunjey Choi
Jin-Hwa Kim
Jun-Yan Zhu
101
12
0
13 Feb 2023
Anti-Compression Contrastive Facial Forgery Detection
Jiajun Huang
Xinqi Zhu
Chengbin Du
Siqi Ma
Surya Nepal
Chang Xu
CVBM
71
4
0
13 Feb 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
Takaaki Saeki
Soumi Maiti
Xinjian Li
Shinji Watanabe
Shinnosuke Takamichi
Hiroshi Saruwatari
109
18
0
30 Jan 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
76
11
0
21 Jan 2023
The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description
Yannis Tevissen
Jérôme Boudy
Frédéric Petitpont
86
1
0
17 Jan 2023
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Youxin Pang
Yong Zhang
Weize Quan
Yanbo Fan
Xiaodong Cun
Ying Shan
Dong-ming Yan
VGen
85
37
0
16 Jan 2023
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
102
3
0
13 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Ziȩba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
106
36
0
10 Jan 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
86
5
0
19 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
Shrikanth Narayanan
108
34
0
18 Dec 2022
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
74
9
0
16 Dec 2022
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bo Zhang
Chenyang Qi
Pan Zhang
Bo Zhang
Hsiang-Tao Wu
Dong Chen
Qifeng Chen
Yong Wang
Fang Wen
117
59
0
15 Dec 2022