VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,100 papers shown
Title
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
27
2
0
14 Mar 2023
Single-branch Network for Multimodal Training
M. S. Saeed
Shah Nawaz
M. H. Khan
M. Zaheer
Karthik Nandakumar
Muhammad Haroon Yousaf
Arif Mahmood
19
12
0
10 Mar 2023
Self-supervised Facial Action Unit Detection with Region and Relation Learning
Juan Song
Zhilei Liu
ViT
23
1
0
10 Mar 2023
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
27
0
0
10 Mar 2023
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning
Ruize Xu
Ruoxuan Feng
Shi-Xiong Zhang
Di Hu
39
21
0
09 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
47
3
0
07 Mar 2023
Distilling Multi-Level X-vector Knowledge for Small-footprint Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
42
4
0
02 Mar 2023
DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Sidharth Sidharth
Ranjana H
Prachi Singh
...
Pratik Roy Chowdhuri
Kaustubh Kulkarni
Swapnil Padhi
Deepu Vijayasenan
Sriram Ganapathy
45
8
0
01 Mar 2023
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks
Kai-Wei Chang
Yu-Kai Wang
Hua Shen
Iu-thing Kang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
VLM
31
44
0
01 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
30
1
0
01 Mar 2023
Distance-based Weight Transfer from Near-field to Far-field Speaker Verification
Li Zhang
Qing Wang
Hongji Wang
Yue Li
Wei Rao
Yannan Wang
Linfu Xie
31
4
0
01 Mar 2023
Speaker Recognition in Realistic Scenario Using Multimodal Data
Saqlain Hussain Shah
M. S. Saeed
Shah Nawaz
Muhammad Haroon Yousaf
CVBM
26
8
0
25 Feb 2023
Towards multi-task learning of speech and speaker recognition
Nik Vaessen
David A. van Leeuwen
CVBM
22
0
0
24 Feb 2023
Supervised Hierarchical Clustering using Graph Neural Networks for Speaker Diarization
Prachi Singh
Amrit Kaul
Sriram Ganapathy
BDL
38
8
0
24 Feb 2023
Amortised Invariance Learning for Contrastive Self-Supervision
Ruchika Chavhan
Henry Gouk
Jan Stuehmer
Calum Heggan
Mehrdad Yaghoobi
Timothy M. Hospedales
SSL
42
11
0
24 Feb 2023
Catch You and I Can: Revealing Source Voiceprint Against Voice Conversion
Jiangyi Deng
Yanjiao Chen
Yinan Zhong
Qianhao Miao
Xueluan Gong
Wenyuan Xu
37
8
0
24 Feb 2023
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Zhepei Wang
Ritwik Giri
Devansh P. Shah
J. Valin
Mike Goodwin
Paris Smaragdis
27
8
0
23 Feb 2023
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification
Qiongqiong Wang
Kong Aik Lee
Tianchi Liu
UQCV
30
7
0
23 Feb 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
J. Dang
23
10
0
22 Feb 2023
Interpretable Spectrum Transformation Attacks to Speaker Recognition
Jiadi Yao
H. Luo
Xiao-Lei Zhang
AAML
34
1
0
21 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
23
26
0
20 Feb 2023
Towards Measuring and Scoring Speaker Diarization Fairness
Yannis Tevissen
Jérôme Boudy
Gérard Chollet
Frédéric Petitpont
23
2
0
20 Feb 2023
Interactive Face Video Coding: A Generative Compression Framework
Bo Chen
Zhao Wang
Binzhe Li
Shurun Wang
Shiqi Wang
Yan Ye
VGen
21
16
0
20 Feb 2023
Probabilistic Back-ends for Online Speaker Recognition and Clustering
A. Sholokhov
Nikita Kuzmin
Kong Aik Lee
Chng Eng Siong
30
1
0
19 Feb 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
33
14
0
17 Feb 2023
3D-aware Blending with Generative NeRFs
Hyunsung Kim
Gayoung Lee
Yunjey Choi
Jin-Hwa Kim
Jun-Yan Zhu
29
12
0
13 Feb 2023
Anti-Compression Contrastive Facial Forgery Detection
Jiajun Huang
Xinqi Zhu
Chengbin Du
Siqi Ma
Surya Nepal
Chang Xu
CVBM
32
3
0
13 Feb 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
Takaaki Saeki
Soumi Maiti
Xinjian Li
Shinji Watanabe
Shinnosuke Takamichi
Hiroshi Saruwatari
37
18
0
30 Jan 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
37
10
0
21 Jan 2023
The Newsbridge - Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description
Yannis Tevissen
Jérôme Boudy
Frédéric Petitpont
28
1
0
17 Jan 2023
DPE: Disentanglement of Pose and Expression for General Video Portrait Editing
Youxin Pang
Yong Zhang
Weize Quan
Yanbo Fan
Xiaodong Cun
Ying Shan
Dong-ming Yan
VGen
34
34
0
16 Jan 2023
Automated speech- and text-based classification of neuropsychiatric conditions in a multidiagnostic setting
L. Hansen
R. Rocca
A. Simonsen
A. Parola
V. Bliksted
...
Dan Bang
Kristian Tylén
Ethan Weed
S. Ostergaard
Riccardo Fusaroli
51
3
0
13 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Zięba
H. Jordan
R. Mcdonnell
Peter Corcoran
DiffM
VGen
24
35
0
10 Jan 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
38
5
0
19 Dec 2022
A Review of Speech-centric Trustworthy Machine Learning: Privacy, Safety, and Fairness
Tiantian Feng
Rajat Hebbar
Nicholas Mehlman
Xuan Shi
Aditya Kommineni
Shrikanth Narayanan
48
31
0
18 Dec 2022
Context-aware Fine-tuning of Self-supervised Speech Models
Suwon Shon
Felix Wu
Kwangyoun Kim
Prashant Sridhar
Karen Livescu
Shinji Watanabe
32
7
0
16 Dec 2022
MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bo Zhang
Chenyang Qi
Pan Zhang
Bo Zhang
Hsiang-Tao Wu
Dong Chen
Qifeng Chen
Yong Wang
Fang Wen
31
54
0
15 Dec 2022
PV3D: A 3D Generative Model for Portrait Video Generation
Eric Xu
Jianfeng Zhang
Jun Hao Liew
Wenqing Zhang
Song Bai
Jiashi Feng
Mike Zheng Shou
VGen
39
20
0
13 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
28
35
0
10 Dec 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Koike Hideki
35
34
0
09 Dec 2022
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu
Zixin Yin
Deyu Zhou
Duomin Wang
Finn Wong
Baoyuan Wang
DiffM
30
36
0
07 Dec 2022
DREAM: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads
Seah Kim
Hyoukjun Kwon
Jinook Song
Jihyuck Jo
Yu-Hsin Chen
Liangzhen Lai
Vikas Chandra
AI4TS
34
7
0
07 Dec 2022
Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition
Zhiyuan Peng
Xuanji He
Ke Ding
Tan Lee
Guanglu Wan
17
6
0
06 Dec 2022
Covariance Regularization for Probabilistic Linear Discriminant Analysis
Zhiyuan Peng
Mingjie Shao
Xuanji He
Xu Li
Tan Lee
Ke Ding
Guanglu Wan
28
1
0
06 Dec 2022
Diffusion Video Autoencoders: Toward Temporally Consistent Face Video Editing via Disentangled Video Encoding
Gyeongman Kim
Hajin Shim
Hyunsung Kim
Yunjey Choi
Junho Kim
Eunho Yang
DiffM
VGen
39
31
0
06 Dec 2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Shinta Otake
Rei Kawakami
Nakamasa Inoue
24
16
0
06 Dec 2022
Topological Data Analysis for Speech Processing
Eduard Tulchinskii
Kristian Kuznetsov
Laida Kushnareva
D. Cherniavskii
S. Barannikov
Irina Piontkovskaya
Sergey I. Nikolenko
Evgeny Burnaev
43
5
0
30 Nov 2022
MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages
Yue Li
Li Zhang
Na Wang
Jie Liu
Linfu Xie
41
0
0
30 Nov 2022
Evaluating and reducing the distance between synthetic and real speech distributions
Christoph Minixhofer
Ondřej Klejch
P. Bell
36
8
0
29 Nov 2022