VoxCeleb: a large-scale speaker identification dataset


26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,098 papers shown
Title
Segment Aggregation for short utterances speaker verification using raw waveforms
Seung-bin Kim
Jee-weon Jung
Hye-jin Shim
Ju-ho Kim
Ha-Jin Yu
8
5
0
07 May 2020
AutoSpeech: Neural Architecture Search for Speaker Recognition
Shaojin Ding
Tianlong Chen
Xinyu Gong
Weiwei Zha
Zhangyang Wang
28
57
0
07 May 2020
What comprises a good talking-head video generation?: A Survey and Benchmark
Lele Chen
Guofeng Cui
Ziyi Kou
Haitian Zheng
Chenliang Xu
EGVM
26
58
0
07 May 2020
Introducing the VoicePrivacy Initiative
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
40
128
0
04 May 2020
VGGSound: A Large-scale Audio-Visual Dataset
Honglie Chen
Weidi Xie
Andrea Vedaldi
Andrew Zisserman
10
556
0
29 Apr 2020
Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision
Soo-Whan Chung
Hong-Goo Kang
Joon Son Chung
SSL
19
38
0
29 Apr 2020
Cross-modal Speaker Verification and Recognition: A Multilingual Perspective
M. S. Saeed
Shah Nawaz
Pietro Morerio
Arif Mahmood
I. Gallo
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
26
25
0
28 Apr 2020
Neural Head Reenactment with Latent Pose Descriptors
Egor Burkov
I. Pasechnik
Artur Grigorev
Victor Lempitsky
3DH
34
130
0
24 Apr 2020
Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release
Yaowei Han
Sheng Li
Yang Cao
Qiang Ma
Masatoshi Yoshikawa
10
45
0
16 Apr 2020
From Inference to Generation: End-to-end Fully Self-supervised Generation of Human Face from Speech
Hyeong-Seok Choi
Changdae Park
Kyogu Lee
CVBM
17
29
0
13 Apr 2020
Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification
Xu Li
Jinghua Zhong
Jianwei Yu
Shoukang Hu
Xixin Wu
Xunying Liu
Helen Meng
BDL
15
11
0
08 Apr 2020
Semi-supervised acoustic modelling for five-lingual code-switched ASR using automatically-segmented soap opera speech
N. Wilkinson
A. Biswas
Emre Yilmaz
Febe de Wet
Ewald van der Westhuizen
T. Niesler
25
10
0
08 Apr 2020
Motion-supervised Co-Part Segmentation
Aliaksandr Siarohin
Subhankar Roy
Stéphane Lathuilière
Sergey Tulyakov
Elisa Ricci
N. Sebe
SSL
13
35
0
07 Apr 2020
Deep Normalization for Speaker Vectors
Yunqi Cai
Lantian Li
Dong Wang
Andrew Abel
42
25
0
07 Apr 2020
Improving Multi-Scale Aggregation Using Feature Pyramid Module for Robust Speaker Verification of Variable-Duration Utterances
Youngmoon Jung
Seong Min Kye
Yeunju Choi
Myunghun Jung
Hoirin Kim
20
36
0
07 Apr 2020
Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs
Seong Min Kye
Youngmoon Jung
Haebeom Lee
Sung Ju Hwang
Hoirin Kim
30
49
0
06 Apr 2020
Speaker Recognition using SincNet and X-Vector Fusion
Mayank Tripathi
Divyanshu Singh
Seba Susan
25
7
0
05 Apr 2020
Neural i-vectors
Ville Vestman
Kong Aik Lee
Tomi Kinnunen
DRL
14
4
0
03 Apr 2020
Temporarily-Aware Context Modelling using Generative Adversarial Networks for Speech Activity Detection
Tharindu Fernando
Sridha Sridharan
Mitchell McLaren
Darshana Priyasad
Simon Denman
Clinton Fookes
14
5
0
02 Apr 2020
Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms
Jee-weon Jung
Seung-bin Kim
Hye-jin Shim
Ju-ho Kim
Ha-Jin Yu
20
60
0
01 Apr 2020
AM-MobileNet1D: A Portable Model for Speaker Recognition
João Antônio Chagas Nunes
David Macêdo
Cleber Zanchettin
20
22
0
31 Mar 2020
A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification
Juan Manuel Coria
H. Bredin
Sahar Ghannay
S. Rosset
25
15
0
31 Mar 2020
Realistic Face Reenactment via Self-Supervised Disentangling of Identity and Pose
Xianfang Zeng
Yusu Pan
Mengmeng Wang
Jiangning Zhang
Yong Liu
CVBM
17
42
0
29 Mar 2020
Learning Inverse Rendering of Faces from Real-world Videos
Yuda Qiu
Zhangyang Xiong
Kai Han
Zhongyuan Wang
Zixiang Xiong
Xiaoguang Han
CVBM
3DH
19
2
0
26 Mar 2020
In defence of metric learning for speaker recognition
Joon Son Chung
Jaesung Huh
Seongkyu Mun
Minjae Lee
Hee-Soo Heo
Soyeon Choe
Chiheon Ham
Sung-Ye Jung
Bong-Jin Lee
Icksang Han
32
432
0
26 Mar 2020
Improving Embedding Extraction for Speaker Verification with Ladder Network
Fei Tao
Gokhan Tur
8
3
0
20 Mar 2020
Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data
Vincent Roger
Jérôme Farinas
J. Pinquier
33
23
0
09 Mar 2020
Lightweight Speaker Verification for Online Identification of New Speakers with Short Segments
I. Vélez
C. Rascón
Gibran Fuentes Pineda
27
10
0
06 Mar 2020
First Order Motion Model for Image Animation
Aliaksandr Siarohin
Stéphane Lathuilière
Sergey Tulyakov
Elisa Ricci
N. Sebe
VGen
DiffM
36
912
0
29 Feb 2020
Bio-Inspired Modality Fusion for Active Speaker Detection
Gustavo Assunção
Nuno Gonçalves
Paulo Menezes
11
3
0
28 Feb 2020
Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models
Edresson Casanova
Arnaldo Cândido Júnior
C. Shulby
F. S. Oliveira
L. Gris
Hamilton Pereira da Silva
S. Aluísio
M. Ponti
11
2
0
25 Feb 2020
Towards Learning a Universal Non-Semantic Representation of Speech
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
44
155
0
25 Feb 2020
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose
Ran Yi
Zipeng Ye
Juyong Zhang
Hujun Bao
Yong-jin Liu
CVBM
27
122
0
24 Feb 2020
DIHARD II is Still Hard: Experimental Results and Discussions from the DKU-LENOVO Team
Qingjian Lin
Weicheng Cai
Lin Yang
Junjie Wang
J. Zhang
Ming Li
VLM
8
18
0
23 Feb 2020
An end-to-end approach for the verification problem: learning the right distance
João Monteiro
Isabela Albuquerque
Md. Jahangir Alam
R. Devon Hjelm
T. Falk
24
6
0
21 Feb 2020
Disentangled Speech Embeddings using Cross-modal Self-supervision
Arsha Nagrani
Joon Son Chung
Samuel Albanie
Andrew Zisserman
SSL
21
88
0
20 Feb 2020
Speaker Diarization with Region Proposal Network
Zili Huang
Shinji Watanabe
Yusuke Fujita
Leibny Paola García-Perera
Yiwen Shao
Daniel Povey
Sanjeev Khudanpur
6
60
0
14 Feb 2020
Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances
Aleksei Gusev
V. Volokhov
Tseren Andzhukaev
Sergey Novoselov
G. Lavrentyeva
...
Anastasia Avdeeva
Artem Ivanov
Alexander Kozlov
Timur Pekhovsky
Yuri N. Matveev
28
47
0
14 Feb 2020
Self-supervised learning for audio-visual speaker diarization
Yifan Ding
Yong-mei Xu
Shi-Xiong Zhang
Yahuan Cong
Liqiang Wang
VLM
39
29
0
13 Feb 2020
AlignNet: A Unifying Approach to Audio-Visual Alignment
Jianren Wang
Zhaoyuan Fang
Hang Zhao
10
37
0
12 Feb 2020
NPLDA: A Deep Neural PLDA Model for Speaker Verification
Shreyas Ramoji
Prashant Krishnan
Sriram Ganapathy
16
31
0
10 Feb 2020
An empirical analysis of information encoded in disentangled neural speaker representations
Raghuveer Peri
Haoqi Li
Krishna Somandepalli
Arindam Jati
Shrikanth Narayanan
DRL
27
13
0
10 Feb 2020
$M^3$T: Multi-Modal Continuous Valence-Arousal Estimation in the Wild
Yuanhang Zhang
Rulin Huang
Jiabei Zeng
Shiguang Shan
Xilin Chen
CVBM
17
27
0
07 Feb 2020
LEAP System for SRE19 CTS Challenge -- Improvements and Error Analysis
Shreyas Ramoji
Prashant Krishnan
Bhargavram Mysore
Prachi Singh
Sriram Ganapathy
9
2
0
07 Feb 2020
An initial investigation on optimizing tandem speaker verification and countermeasure systems using reinforcement learning
Anssi Kanervisto
Ville Hautamaki
Tomi Kinnunen
Junichi Yamagishi
19
2
0
06 Feb 2020
Within-sample variability-invariant loss for robust speaker recognition under noisy environments
Danwei Cai
Weicheng Cai
Ming Li
29
46
0
03 Feb 2020
DropClass and DropAdapt: Dropping classes for deep speaker representation learning
Chau Luu
P. Bell
Steve Renals
VLM
16
3
0
02 Feb 2020
Analysis of Deep Feature Loss based Enhancement for Speaker Verification
Saurabh Kataria
P. S. Nidadavolu
Jesús Villalba
Najim Dehak
21
13
0
01 Feb 2020
MCSAE: Masked Cross Self-Attentive Encoding for Speaker Embedding
Soonshin Seo
Ji-Hwan Kim
15
0
0
28 Jan 2020
Pairwise Discriminative Neural PLDA for Speaker Verification
Shreyas Ramoji
Prashant Krishnan
Prachi Singh
Sriram Ganapathy
14
7
0
20 Jan 2020