1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
Title
Face Animation with an Attribute-Guided Diffusion Model
Bo-Wen Zeng
Xuhui Liu
Sicheng Gao
Boyu Liu
Hong Li
Jianzhuang Liu
Baochang Zhang
86
33
0
06 Apr 2023
StyleGAN Salon: Multi-View Latent Optimization for Pose-Invariant Hairstyle Transfer
Sasikarn Khwanmuang
Pakkapon Phongthawee
Patsorn Sangkloy
Supasorn Suwajanakorn
3DH
67
8
0
05 Apr 2023
AutoAD: Movie Description in Context
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
77
35
0
29 Mar 2023
VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs
Anna Frühstück
N. Sarafianos
Yuanlu Xu
Peter Wonka
Tony Tung
85
20
0
28 Mar 2023
RobustSwap: A Simple yet Robust Face Swapping Model against Attribute Leakage
Jaeseong Lee
Taewoo Kim
S. Park
Younggun Lee
Jaegul Choo
CVBM
105
2
0
28 Mar 2023
A Universal Identity Backdoor Attack against Speaker Verification based on Siamese Network
Haodong Zhao
Wei Du
Junjie Guo
Gongshen Liu
AAML
87
0
0
28 Mar 2023
CelebV-Text: A Large-Scale Facial Text-Video Dataset
Jianhui Yu
Hao Zhu
Liming Jiang
Chen Change Loy
Weidong (Tom) Cai
Wayne Wu
77
62
0
26 Mar 2023
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
Yangfu Li
Jiapan Gan
Xiaodan Lin
53
6
0
20 Mar 2023
Right the docs: Characterising voice dataset documentation practices used in machine learning
Kathy Reid
Elizabeth T. Williams
66
2
0
19 Mar 2023
The Graph feature fusion technique for speaker recognition based on wav2vec2.0 framework
Zirui Ge
Haiyan Guo
Zhen Yang
65
1
0
19 Mar 2023
MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation
Haozhe Wu
Jia Jia
Junliang Xing
Hongwei Xu
Xiangyuan Wang
Jelo Wang
CVBM
83
7
0
17 Mar 2023
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
51
0
0
15 Mar 2023
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
59
2
0
14 Mar 2023
Single-branch Network for Multimodal Training
M. S. Saeed
Shah Nawaz
M. H. Khan
M. Zaheer
Karthik Nandakumar
Muhammad Haroon Yousaf
Arif Mahmood
42
13
0
10 Mar 2023
Self-supervised Facial Action Unit Detection with Region and Relation Learning
Juan Song
Zhilei Liu
ViT
40
1
0
10 Mar 2023
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
58
0
0
10 Mar 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
80
24
0
09 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
51
3
0
09 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
99
3
0
07 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
130
4
0
02 Mar 2023
DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Sidharth Sidharth
Ranjana H
Prachi Singh
...
Pratik Roy Chowdhuri
Kaustubh Kulkarni
Swapnil Padhi
Deepu Vijayasenan
Sriram Ganapathy
80
9
0
01 Mar 2023
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Kai-Wei Chang
Yu-Kai Wang
Hua Shen
Iu-thing Kang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
VLM
88
46
0
01 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
70
1
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
58
4
0
01 Mar 2023
Speaker Recognition in Realistic Scenario Using Multimodal Data
Saqlain Hussain Shah
M. S. Saeed
Shah Nawaz
Muhammad Haroon Yousaf
CVBM
79
9
0
25 Feb 2023
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
24
0
0
24 Feb 2023
Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization
Prachi Singh
Amrit Kaul
Sriram Ganapathy
BDL
64
8
0
24 Feb 2023
Amortised Invariance Learning for Contrastive Self-Supervision
Ruchika Chavhan
Henry Gouk
Jan Stuehmer
Calum Heggan
Mehrdad Yaghoobi
Timothy M. Hospedales
SSL
88
11
0
24 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
99
8
0
24 Feb 2023
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
72
9
0
23 Feb 2023
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification
Qiongqiong Wang
Kong Aik Lee
Tianchi Liu
UQCV
73
7
0
23 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
Jianwu Dang
83
10
0
22 Feb 2023
Interpretable Spectrum Transformation Attacks to Speaker Recognition
Jiadi Yao
H. Luo
Xiao-Lei Zhang
AAML
61
2
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
106
26
0
20 Feb 2023
Towards Measuring and Scoring Speaker Diarization Fairness
Yannis Tevissen
Jérôme Boudy
Gérard Chollet
Frédéric Petitpont
99
2
0
20 Feb 2023
Interactive Face Video Coding: A Generative Compression Framework
Bo Chen
Zhao Wang
Binzhe Li
Shurun Wang
Shiqi Wang
Yan Ye
VGen
71
19
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
63
1
0
19 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
81
15
0
17 Feb 2023
3D-aware Blending with Generative NeRFs
Hyunsung Kim
Gayoung Lee
Yunjey Choi
Jin-Hwa Kim
Jun-Yan Zhu
101
12
0
13 Feb 2023
Anti-Compression Contrastive Facial Forgery Detection
Jiajun Huang
Xinqi Zhu
Chengbin Du
Siqi Ma
Surya Nepal
Chang Xu
CVBM
71
4
0
13 Feb 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
Takaaki Saeki
Soumi Maiti
Xinjian Li
Shinji Watanabe
Shinnosuke Takamichi
Hiroshi Saruwatari
109
18
0
30 Jan 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
76
11
0
21 Jan 2023
The Newsbridge - Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description
Yannis Tevissen
Jérôme Boudy
Frédéric Petitpont
86
1
0
17 Jan 2023
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Youxin Pang
Yong Zhang
Weize Quan
Yanbo Fan
Xiaodong Cun
Ying Shan
Dong-ming Yan
VGen
85
37
0
16 Jan 2023
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
102
3
0
13 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Zięba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
106
36
0
10 Jan 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
86
5
0
19 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
Shrikanth Narayanan
108
34
0
18 Dec 2022
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
74
9
0
16 Dec 2022
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bowen Zhang
Chenyang Qi
Pan Zhang
Bo Zhang
Hsiang-Tao Wu
Dong Chen
Qifeng Chen
Yong Wang
Fang Wen
117
59
0
15 Dec 2022