VoxCeleb: a large-scale speaker identification dataset

26 June 2017

Arsha Nagrani

Joon Son Chung

Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown

Privacy-preserving Automatic Speaker Diarization Francisco Teixeira A. Abad Bhiksha Raj Isabel Trancoso 75 4 0 26 Oct 2022
In search of strong embedding extractors for speaker diarisation Jee-weon Jung Hee-Soo Heo Bong-Jin Lee Jaesung Huh A. Brown Youngki Kwon Shinji Watanabe Joon Son Chung 83 16 0 26 Oct 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge Bowen Pang Huan Zhao Gaosheng Zhang Xiaoyue Yang Yanguo Sun Li Zhang Qing Wang Linfu Xie BDL 52 2 0 26 Oct 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input Daisuke Niizumi Daiki Takeuchi Yasunori Ohishi Noboru Harada K. Kashino SSL 105 33 0 26 Oct 2022
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation Evonne Lee Guangzhi Sun Chuxu Zhang P. Woodland 51 1 0 24 Oct 2022
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation Xiaoyu Liu Xu Li Joan Serrà 87 9 0 23 Oct 2022
Low-Resource Multilingual and Zero-Shot Multispeaker TTS Florian Lux Julia Koch Ngoc Thang Vu 107 23 0 21 Oct 2022
Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis Florian Lux Ching-Yi Chen Ngoc Thang Vu 39 1 0 21 Oct 2022
Large-scale learning of generalised representations for speaker recognition Jee-weon Jung Hee-Soo Heo Bong-Jin Lee Jaesong Lee Hye-jin Shim Youngki Kwon Joon Son Chung Shinji Watanabe CVBM 65 6 0 20 Oct 2022
Risk of re-identification for shared clinical speech recordings D. Wiepert B. Malin Joseph James Duffy Rene L. Utianski John L. Stricker David T. Jones Hugo Botha 58 0 0 18 Oct 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification? Sandipana Dowerah Romain Serizel D. Jouvet Mohammad MohammadAmini D. Matrouf 82 2 0 17 Oct 2022
Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations Themos Stafylakis Ladislav Mošner Sofoklis Kakouros Oldrich Plchot L. Burget J. Černocký SSL 60 10 0 15 Oct 2022
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks Run Wang Jixing Ren Boheng Li Tianyi She Wenhui Zhang Liming Fang Jing Chen Chao Shen Lina Wang WIGM 79 19 0 14 Oct 2022
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy Sarina Meyer Pascal Tilli Pavel Denisov Florian Lux Julia Koch Ngoc Thang Vu 85 32 0 13 Oct 2022
Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar Aolan Sun Xulong Zhang Tiandong Ling Jianzong Wang Ning Cheng Jing Xiao 52 4 0 13 Oct 2022
Revisiting Self-Supervised Contrastive Learning for Facial Expression Recognition Yuxuan Shu Xiao Gu Guangyao Yang Benny Lo SSL 107 18 0 08 Oct 2022
Compressing Video Calls using Synthetic Talking Heads Madhav Agarwal Anchit Gupta Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar 49 11 0 07 Oct 2022
A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis Yichen Han Ya Li Yingming Gao Jinlong Xue Songpo Wang Lei Yang 34 2 0 07 Oct 2022
Audio-Visual Face Reenactment Madhav Agarwal Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar DiffM VGen 61 24 0 06 Oct 2022
PSVRF: Learning to restore Pitch-Shifted Voice without reference Yangfu Li Xiaodan Lin Jiaxin Yang 55 0 0 06 Oct 2022
Geometry Driven Progressive Warping for One-Shot Face Animation Yatao Zhong F. Amjadi Ilya Zharkov 3DH CVBM 113 1 0 05 Oct 2022
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward Awais Khan K. Malik James Ryan Mikul Saravanan AAML 118 15 0 02 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations Heinrich Dinkel Zhiyong Yan Yongqing Wang Junbo Zhang Yujun Wang 62 1 0 30 Sep 2022
Motion and Appearance Adaptation for Cross-Domain Motion Transfer Borun Xu Biao Wang Jinhong Deng Jiale Tao T. Ge Yuning Jiang Wen Li Lixin Duan 117 9 0 29 Sep 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks Andrés Vasco-Carofilis Laura Fernández-Robles Enrique Alegre Eduardo FIDALGO 85 3 0 28 Sep 2022
Motion Transformer for Unsupervised Image Animation Jiale Tao Biao Wang T. Ge Yuning Jiang Wen Li Lixin Duan ViT 90 11 0 28 Sep 2022
StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment Stella Bounareli Christos Tzelepis Vasileios Argyriou Ioannis Patras Georgios Tzimiropoulos CVBM 88 18 0 27 Sep 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge Jixun Yao Qing Wang Li Zhang Pengcheng Guo Yuhao Liang Linfu Xie PICV 80 17 0 24 Sep 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed Mei-Shuo Chen Z. Duan 105 11 0 23 Sep 2022
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022 Qutang Cai Guoqiang Hong Zhijian Ye Ximin Li Haizhou Li 119 7 0 23 Sep 2022
Gemino: Practical and Robust Neural Compression for Video Conferencing Vibhaalakshmi Sivaraman Pantea Karimi Vedantha Venkatapathy Mehrdad Khani Shirkoohi Sadjad Fouladi M. Alizadeh F. Durand Vivienne Sze 3DH 115 19 0 21 Sep 2022
FNeVR: Neural Volume Rendering for Face Animation Bo-Wen Zeng Bo-Ye Liu Hong Li Xuhui Liu Jianzhuang Liu Dapeng Chen Wei Peng Baochang Zhang CVBM 3DH 121 28 0 21 Sep 2022
Pay Attention to Hard Trials Lantian Li Di Wang Dong Wang 113 1 0 10 Sep 2022
Defend Data Poisoning Attacks on Voice Authentication Ke Li Cameron Baird D. Lin AAML 75 9 0 09 Sep 2022
Joint Speaker Encoder and Neural Back-end Model for Fully End-to-End Automatic Speaker Verification with Multiple Enrollment Utterances Chang Zeng Xiaoxiao Miao Xin Wang Erica Cooper Junichi Yamagishi 69 6 0 01 Sep 2022
Computing with Hypervectors for Efficient Speaker Identification Ping-Chen Huang Denis Kleyko J. Rabaey Bruno A. Olshausen P. Kanerva 81 2 0 28 Aug 2022
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization Dongmei Wang Xiong Xiao Naoyuki Kanda Takuya Yoshioka Jian Wu 87 29 0 27 Aug 2022
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages Tahir Javed Kaushal Bhogale A. Raman Anoop Kunchukuttan Pratyush Kumar Mitesh M. Khapra ELM 90 26 0 24 Aug 2022
Learning Branched Fusion and Orthogonal Projection for Face-Voice Association M. S. Saeed Shah Nawaz M. H. Khan S. Javed Muhammad Haroon Yousaf Alessio Del Bue CVBM 74 4 0 22 Aug 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective Yake Wei Di Hu Yapeng Tian Xuelong Li 135 55 0 20 Aug 2022
Disentangled Speaker Representation Learning via Mutual Information Minimization Sung Hwan Mun Mingrui Han Minchan Kim Dongjune Lee N. Kim DRL 97 11 0 17 Aug 2022
Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment Taewoo Kim Chaeyeon Chung Yoonseong Kim S. Park Kangyeol Kim Jaegul Choo 3DH 69 21 0 16 Aug 2022
FDNeRF: Few-shot Dynamic Neural Radiance Fields for Face Reconstruction and Expression Editing Jingbo Zhang Xiaoyu Li Bo Liu Can Wang Jing Liao 3DH CVBM 138 42 0 11 Aug 2022
Non-Contrastive Self-supervised Learning for Utterance-Level Information Extraction from Speech Jaejin Cho Jesús Villalba Laureano Moro-Velazquez Najim Dehak SSL 90 18 0 10 Aug 2022
Robust Acoustic Domain Identification with its Application to Speaker Diarization Kishore Kumar A Shefali Waldekar Md. Sahidullah G. Saha 52 0 0 05 Aug 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition Wei Xia John H. L. Hansen 65 4 0 04 Aug 2022
Free-HeadGAN: Neural Talking Head Synthesis with Explicit Gaze Control M. Doukas Evangelos Ververas V. Sharmanska Stefanos Zafeiriou CVBM 70 15 0 03 Aug 2022
The SJTU System for Short-duration Speaker Verification Challenge 2021 Bing Han Zhengyang Chen Zhikai Zhou Y. Qian 19 7 0 03 Aug 2022
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction Bing Han Zhengyang Chen Y. Qian 61 32 0 03 Aug 2022
End-To-End Audiovisual Feature Fusion for Active Speaker Detection Fiseha B. Tesema Zheyuan Lin Shiqiang Zhu Wei Song J. Gu Hong-Chuan Wu 42 4 0 27 Jul 2022