ResearchTrend.AI
VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

48 / 1,098 papers shown
Deep Segment Attentive Embedding for Duration Robust Speaker Verification
Bin Liu
Shuai Nie
Yaping Zhang
Shan Liang
Wenju Liu
21
4
0
01 Nov 2018
Deep Net Features for Complex Emotion Recognition
Bhalaji Nagarajan
V. R. M. Oruganti
16
3
0
31 Oct 2018
Deep Learning as Feature Encoding for Emotion Recognition
Bhalaji Nagarajan
V. R. M. Oruganti
14
1
0
30 Oct 2018
Short utterance compensation in speaker verification via cosine-based teacher-student learning of speaker embeddings
Jee-weon Jung
Hee-Soo Heo
Hye-jin Shim
Ha-Jin Yu
18
36
0
25 Oct 2018
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
Quan Wang
Hannah Muckenhirn
K. Wilson
Prashant Sridhar
Zelin Wu
J. Hershey
Rif A. Saurous
Ron J. Weiss
Ye Jia
Ignacio López Moreno
11
368
0
11 Oct 2018
Fully Supervised Speaker Diarization
Aonan Zhang
Quan Wang
Zhenyao Zhu
John Paisley
Chong-Jun Wang
BDL
16
217
0
10 Oct 2018
Attention Mechanism in Speaker Recognition: What Does It Learn in Deep Speaker Embedding?
Qiongqiong Wang
K. Okabe
Kong Aik Lee
Hitoshi Yamamoto
Takafumi Koshinaka
22
31
0
25 Sep 2018
Unsupervised Representation Learning of Speech for Dialect Identification
Suwon Shon
Wei-Ning Hsu
James R. Glass
12
13
0
12 Sep 2018
Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model
Suwon Shon
Hao Tang
James R. Glass
11
88
0
12 Sep 2018
One-Shot Speaker Identification for a Service Robot using a CNN-based Generic Verifier
I. Vélez
C. Rascón
Gibran Fuentes Pineda
15
7
0
11 Sep 2018
Self-Supervised Generation of Spatial Audio for 360 Video
Pedro Morgado
Nuno Vasconcelos
Timothy R. Langlois
Oliver Wang
MDE
24
171
0
07 Sep 2018
Self-supervised learning of a facial attribute embedding from video
Olivia Wiles
A. Sophia Koepke
Andrew Zisserman
CVBM
SSL
24
132
0
21 Aug 2018
Emotion Recognition in Speech using Cross-Modal Transfer in the Wild
Samuel Albanie
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
CVBM
30
270
0
16 Aug 2018
Prosodic-Enhanced Siamese Convolutional Neural Networks for Cross-Device Text-Independent Speaker Verification
Sobhan Soleymani
Ali Dabouei
Seyed Mehdi Iranmanesh
Hadi Kazemi
J. Dawson
Nasser M. Nasrabadi
6
18
0
31 Jul 2018
Speaker Recognition from Raw Waveform with SincNet
Mirco Ravanelli
Yoshua Bengio
50
700
0
29 Jul 2018
X2Face: A network for controlling face generation by using images, audio, and pose codes
Olivia Wiles
A. Sophia Koepke
Andrew Zisserman
CVBM
30
410
0
27 Jul 2018
Unified Hypersphere Embedding for Speaker Recognition
Mahdi Hajibabaei
Dengxin Dai
16
86
0
22 Jul 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Hang Zhou
Yu Liu
Ziwei Liu
Ping Luo
Xiaogang Wang
CVBM
31
436
0
20 Jul 2018
Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
Yandong Wen
Mahmoud Al Ismail
Weiyang Liu
Bhiksha Raj
Rita Singh
FedML
22
70
0
12 Jul 2018
Detection and Analysis of Content Creator Collaborations in YouTube Videos using Face- and Speaker-Recognition
Moritz Lode
Michael Örtl
Christian Koch
Amr Rizk
R. Steinmetz
CVBM
6
1
0
05 Jul 2018
Weakly Supervised Training of Speaker Identification Models
Mart Karu
Tanel Alumäe
14
10
0
22 Jun 2018
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Tomas Jakab
Ankush Gupta
Hakan Bilen
Andrea Vedaldi
SSL
33
252
0
20 Jun 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
266
2,238
0
14 Jun 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
820
0
12 Jun 2018
Analysis of Length Normalization in End-to-End Speaker Verification System
Weicheng Cai
Jinkun Chen
Ming Li
VLM
14
39
0
08 Jun 2018
Speaker Clustering Using Dominant Sets
Feliks Hibraj
Sebastiano Vascon
Thilo Stadelmann
Marcello Pelillo
10
4
0
21 May 2018
Sparse Architectures for Text-Independent Speaker Verification Using Deep Neural Networks
Sara Sedighi
Shayan Ramhormozi
9
0
0
19 May 2018
On Learning Associations of Faces and Voices
Changil Kim
Hijung Valentina Shin
Tae-Hyun Oh
Alexandre Kaspar
Mohamed A. Elgharib
Wojciech Matusik
CVBM
16
83
0
15 May 2018
Supervector Compression Strategies to Speed up I-Vector System Development
Ville Vestman
Tomi Kinnunen
32
3
0
03 May 2018
Learnable PINs: Cross-Modal Embeddings for Person Identity
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
SSL
41
140
0
02 May 2018
End-to-End Residual CNN with L-GM Loss Speaker Verification System
Xuan Shi
Xingjian Du
Mengyao Zhu
12
5
0
02 May 2018
A Deep Network for Arousal-Valence Emotion Prediction with Acoustic-Visual Cues
Songyou Peng
Le Zhang
Yutong Ban
Mengsha Fang
Stefan Winkler
8
25
0
02 May 2018
Text-Independent Speaker Verification Using Long Short-Term Memory Networks
Aryan Mobiny
Mohammad Najarian
22
16
0
02 May 2018
Collaborations on YouTube: From Unsupervised Detection to the Impact on Video and Channel Popularity
Christian Koch
Moritz Lode
Denny Stohr
Amr Rizk
R. Steinmetz
4
5
0
01 May 2018
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System
Weicheng Cai
Jinkun Chen
Ming Li
16
331
0
14 Apr 2018
Talking Face Generation by Conditional Recurrent Adversarial Network
Yang Song
Jingwen Zhu
Dawei Li
Xiaolong Wang
Hairong Qi
CVBM
27
192
0
13 Apr 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
51
745
0
10 Apr 2018
Seeing Voices and Hearing Faces: Cross-modal biometric matching
Arsha Nagrani
Samuel Albanie
Andrew Zisserman
CVBM
22
219
0
01 Apr 2018
Attentive Statistics Pooling for Deep Speaker Embedding
K. Okabe
Takafumi Koshinaka
Koichi Shinoda
31
525
0
29 Mar 2018
Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors
Anna Silnova
Niko Brummer
D. Garcia-Romero
David Snyder
L. Burget
VLM
BDL
14
33
0
24 Mar 2018
Speaker Verification using Convolutional Neural Networks
Hossein Salehghaffari
13
25
0
14 Mar 2018
Convolutional Neural Networks and Language Embeddings for End-to-End Dialect Recognition
Suwon Shon
Ahmed M. Ali
James R. Glass
14
61
0
12 Mar 2018
Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model
Niko Brummer
Anna Silnova
L. Burget
Themos Stafylakis
8
20
0
27 Feb 2018
Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics
Arindam Jati
P. Georgiou
SSL
21
48
0
22 Feb 2018
Fitting New Speakers Based on a Short Untranscribed Sample
Eliya Nachmani
Adam Polyak
Yaniv Taigman
Lior Wolf
24
84
0
20 Feb 2018
From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script
Arsha Nagrani
Andrew Zisserman
16
54
0
31 Jan 2018
You said that?
Joon Son Chung
A. Jamaludin
Andrew Zisserman
CVBM
23
258
0
08 May 2017
MatConvNet - Convolutional Neural Networks for MATLAB
Andrea Vedaldi
Karel Lenc
183
2,946
0
15 Dec 2014