VoxCeleb: a large-scale speaker identification dataset

26 June 2017

Arsha Nagrani

Joon Son Chung

Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown

Title
AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection Joseph Roth Sourish Chaudhuri Ondřej Klejch Radhika Marvin Andrew C. Gallagher ... S. Ramaswamy Arkadiusz Stopczynski Cordelia Schmid Zhonghua Xi C. Pantofaru 84 145 0 05 Jan 2019
Speech and Speaker Recognition from Raw Waveform with SincNet Mirco Ravanelli Yoshua Bengio 56 30 0 13 Dec 2018
Theoretical Guarantees of Deep Embedding Losses Under Label Noise Nam Le J. Odobez NoLa 23 1 0 06 Dec 2018
TwoStreamVAN: Improving Motion Modeling in Video Generation Ximeng Sun Huijuan Xu Kate Saenko DiffM VGen 61 17 0 03 Dec 2018
Learning Speaker Representations with Mutual Information Mirco Ravanelli Yoshua Bengio SSL DRL 102 91 0 01 Dec 2018
Noise-tolerant Audio-visual Online Person Verification using an Attention-based Neural Network Fusion Suwon Shon Tae-Hyun Oh James R. Glass 59 50 0 27 Nov 2018
Interpretable Convolutional Filters with SincNet Mirco Ravanelli Yoshua Bengio 93 107 0 23 Nov 2018
iQIYI-VID: A Large Dataset for Multi-modal Person Identification Yuanliu Liu Bo Peng Peipei Shi He Yan Yong Zhou ... Tingwei Gao G. Wang Jian Liu Xiangju Lu Danming Xie 77 35 0 19 Nov 2018
Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection Tomi Kinnunen Rosa González Hautamäki Ville Vestman Md. Sahidullah 70 5 0 09 Nov 2018
Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search R. Krishnan Bilal Soomro Mahesh Subedar Ville Hautamaki Tomi Kinnunen 103 5 0 08 Nov 2018
Gaussian-Constrained training for speaker verification Lantian Li Zhiyuan Tang Ying Shi Dong Wang 58 26 0 08 Nov 2018
Adapting End-to-End Neural Speaker Verification to New Languages and Recording Conditions with Adversarial Training Christoph Dann Lihong Li Wei Wei 86 39 0 07 Nov 2018
Building Corpora for Single-Channel Speech Separation Across Multiple Domains Aman Rana Gregory Sell Leibny Paola García Perera A. Lowe Pratik Shah 64 10 0 06 Nov 2018
How to Improve Your Speaker Embeddings Extractor in Generic Toolkits Christopher Snyder Lukáš Burget S. Vishwanath Themos Stafylakis Jan Cernocky 80 51 0 05 Nov 2018
Deep Segment Attentive Embedding for Duration Robust Speaker Verification Bin Liu Shuai Nie Yaping Zhang Shan Liang Wenju Liu 52 4 0 01 Nov 2018
Deep Net Features for Complex Emotion Recognition Bhalaji Nagarajan V. R. M. Oruganti 23 3 0 31 Oct 2018
Deep Learning as Feature Encoding for Emotion Recognition Bhalaji Nagarajan V. R. M. Oruganti 26 1 0 30 Oct 2018
Short utterance compensation in speaker verification via cosine-based teacher-student learning of speaker embeddings Jee-weon Jung Hee-Soo Heo Hye-jin Shim Ha-Jin Yu 78 37 0 25 Oct 2018
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking Quan Wang Hannah Muckenhirn K. Wilson Prashant Sridhar Zelin Wu J. Hershey Rif A. Saurous Ron J. Weiss Ye Jia Ignacio López Moreno 127 370 0 11 Oct 2018
Fully Supervised Speaker Diarization Aonan Zhang Quan Wang Zhenyao Zhu John Paisley Chong-Jun Wang BDL 142 218 0 10 Oct 2018
Attention Mechanism in Speaker Recognition: What Does It Learn in Deep Speaker Embedding? Qiongqiong Wang K. Okabe Kong Aik Lee Hitoshi Yamamoto Takafumi Koshinaka 60 31 0 25 Sep 2018
Unsupervised Representation Learning of Speech for Dialect Identification Suwon Shon Wei-Ning Hsu James R. Glass 43 13 0 12 Sep 2018
Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model Suwon Shon Hao Tang James R. Glass 62 88 0 12 Sep 2018
One-Shot Speaker Identification for a Service Robot using a CNN-based Generic Verifier I. Vélez C. Rascón Gibran Fuentes Pineda 30 7 0 11 Sep 2018
Self-Supervised Generation of Spatial Audio for 360 Video Pedro Morgado Nuno Vasconcelos Timothy R. Langlois Oliver Wang MDE 66 174 0 07 Sep 2018
Self-supervised learning of a facial attribute embedding from video Olivia Wiles A. Sophia Koepke Andrew Zisserman CVBM SSL 86 133 0 21 Aug 2018
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild Samuel Albanie Arsha Nagrani Andrea Vedaldi Andrew Zisserman CVBM 81 272 0 16 Aug 2018
Prosodic-Enhanced Siamese Convolutional Neural Networks for Cross-Device Text-Independent Speaker Verification Sobhan Soleymani Ali Dabouei Seyed Mehdi Iranmanesh Hadi Kazemi J. Dawson Nasser M. Nasrabadi 57 18 0 31 Jul 2018
Speaker Recognition from Raw Waveform with SincNet Mirco Ravanelli Yoshua Bengio 203 724 0 29 Jul 2018
X2Face: A network for controlling face generation by using images, audio, and pose codes Olivia Wiles A. Sophia Koepke Andrew Zisserman CVBM 96 416 0 27 Jul 2018
Unified Hypersphere Embedding for Speaker Recognition Mahdi Hajibabaei Dengxin Dai 73 86 0 22 Jul 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation Hang Zhou Yu Liu Ziwei Liu Ping Luo Xiaogang Wang CVBM 94 443 0 20 Jul 2018
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces Yandong Wen Mahmoud Al Ismail Weiyang Liu Bhiksha Raj Rita Singh FedML 59 71 0 12 Jul 2018
Detection and Analysis of Content Creator Collaborations in YouTube Videos using Face- and Speaker-Recognition Moritz Lode Michael Örtl Christian Koch Amr Rizk R. Steinmetz CVBM 21 1 0 05 Jul 2018
Weakly Supervised Training of Speaker Identification Models Mart Karu Tanel Alumäe 42 10 0 22 Jun 2018
Unsupervised Learning of Object Landmarks through Conditional Image Generation Tomas Jakab Ankush Gupta Hakan Bilen Andrea Vedaldi SSL 105 253 0 20 Jun 2018
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 368 2,289 0 14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Ye Jia Yu Zhang Ron J. Weiss Quan Wang Jonathan Shen ... Zhiwen Chen Patrick Nguyen Ruoming Pang Ignacio López Moreno Yonghui Wu 270 838 0 12 Jun 2018
Analysis of Length Normalization in End-to-End Speaker Verification System Weicheng Cai Jinkun Chen Ming Li VLM 61 39 0 08 Jun 2018
Speaker Clustering Using Dominant Sets Feliks Hibraj Sebastiano Vascon Thilo Stadelmann Marcello Pelillo 22 4 0 21 May 2018
Sparse Architectures for Text-Independent Speaker Verification Using Deep Neural Networks Sara Sedighi Shayan Ramhormozi 16 0 0 19 May 2018
On Learning Associations of Faces and Voices Changil Kim Hijung Valentina Shin Tae-Hyun Oh Alexandre Kaspar Mohamed A. Elgharib Wojciech Matusik CVBM 90 84 0 15 May 2018
Supervector Compression Strategies to Speed up I-Vector System Development Ville Vestman Tomi Kinnunen 61 3 0 03 May 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity Arsha Nagrani Samuel Albanie Andrew Zisserman SSL 138 141 0 02 May 2018
End-to-End Residual CNN with L-GM Loss Speaker Verification System Xuan Shi Xingjian Du Mengyao Zhu 32 5 0 02 May 2018
A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues Songyou Peng Le Zhang Yutong Ban Mengsha Fang Stefan Winkler 94 25 0 02 May 2018
Text-Independent Speaker Verification Using Long Short-Term Memory Networks Aryan Mobiny Mohammad Najarian 69 16 0 02 May 2018
Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity Christian Koch Moritz Lode Denny Stohr Amr Rizk R. Steinmetz 11 4 0 01 May 2018
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System Weicheng Cai Jinkun Chen Ming Li 68 332 0 14 Apr 2018
Talking Face Generation by Conditional Recurrent Adversarial Network Yang Song Jingwen Zhu Dawei Li Xiaolong Wang Hairong Qi CVBM 177 196 0 13 Apr 2018