VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion
Suzhe Wang
Lincheng Li
Yu-qiong Ding
Changjie Fan
Xin Yu
VGen
107
166
0
20 Jul 2021
Controlled AutoEncoders to Generate Faces from Voices
Hao Liang
Lulan Yu
Gu Xu
Bhiksha Raj
Rita Singh
CVBM
27
4
0
16 Jul 2021
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding
Hongning Zhu
Kong Aik Lee
Haizhou Li
78
15
0
14 Jul 2021
MACCIF-TDNN: Multi aspect aggregation of channel and context interdependence features in TDNN-based speaker verification
Fangyuan Wang
Z. Song
Hongchen Jiang
Bo Xu
55
8
0
07 Jul 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio
Naoyuki Kanda
Xiong Xiao
Jian Wu
Tianyan Zhou
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
73
14
0
06 Jul 2021
Multi-modality Deep Restoration of Extremely Compressed Face Videos
Xi Zhang
Xiaolin Wu
CVBM
80
13
0
05 Jul 2021
The HCCL Speaker Verification System for Far-Field Speaker Verification Challenge
Zhuo Li
Ce Fang
Runqiu Xiao
Zhigao Chen
Wenchao Wang
Yonghong Yan
52
2
0
03 Jul 2021
Pretext Tasks selection for multitask self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
Abdel Heba
SSL
89
13
0
01 Jul 2021
What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis
Shammur A. Chowdhury
Nadir Durrani
Ahmed M. Ali
118
16
0
01 Jul 2021
Adversarial Sample Detection for Speaker Verification by Neural Vocoders
Haibin Wu
Po-Chun Hsu
Ji Gao
Shanshan Zhang
Shen Huang
Jian Kang
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
93
21
0
01 Jul 2021
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus
Hamdy Mubarak
A. Hussein
Shammur A. Chowdhury
Ahmed M. Ali
51
48
0
24 Jun 2021
Graph-based Label Propagation for Semi-Supervised Speaker Identification
Long Chen
Venkatesh Ravichandran
A. Stolcke
SSL
117
16
0
15 Jun 2021
Adaptive Margin Circle Loss for Speaker Verification
Runqiu Xiao
120
11
0
15 Jun 2021
Voting for the right answer: Adversarial defense for speaker verification
Haibin Wu
Yang Zhang
Zhiyong Wu
Dong Wang
Hung-yi Lee
AAML
76
25
0
15 Jun 2021
Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection
Zhenyu Zhang
Yanhao Ge
Renwang Chen
Ying Tai
Yan Yan
Jian Yang
Chengjie Wang
Jilin Li
Feiyue Huang
CVBM
3DH
71
26
0
15 Jun 2021
Learning Audio-Visual Dereverberation
Changan Chen
Wei-Ju Sun
David Harwath
Kristen Grauman
92
32
0
14 Jun 2021
Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing
Tomi Kinnunen
A. Nautsch
Md. Sahidullah
Nicholas W. D. Evans
Xin Wang
Massimiliano Todisco
Héctor Delgado
Junichi Yamagishi
Kong Aik Lee
30
1
0
11 Jun 2021
Unsupervised Co-part Segmentation through Assembly
Qingzhe Gao
Bin Wang
Libin Liu
Baoquan Chen
74
13
0
10 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit
Mirco Ravanelli
Titouan Parcollet
Peter William VanHarn Plantinga
Aku Rouhe
Samuele Cornell
...
William Aris
Hwidong Na
Yan Gao
R. Mori
Yoshua Bengio
135
770
0
08 Jun 2021
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios
E. Tsunoo
Kentarou Shibata
Chaitanya Narisetty
Yosuke Kashiwagi
Shinji Watanabe
69
12
0
07 Jun 2021
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis
Beáta Lőrincz
Adriana Stan
M. Giurgiu
33
2
0
03 Jun 2021
APES: Audiovisual Person Search in Untrimmed Video
Juan Carlos León Alcázar
Long Mai
Federico Perazzi
Joon-Young Lee
Pablo Arbeláez
Guohao Li
Fabian Caba Heilbron
50
6
0
03 Jun 2021
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
Haibin Wu
Xu Li
Andy T. Liu
Zhiyong Wu
Helen Meng
Hung-yi Lee
AAML
SSL
116
30
0
01 Jun 2021
X-Vectors with Multi-Scale Aggregation for Speaker Diarization
Myung-Jae Kim
V. Apsingekar
Divya Neelagiri
51
0
0
16 May 2021
Move2Hear: Active Audio-Visual Source Separation
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
65
44
0
15 May 2021
Study on the temporal pooling used in deep neural networks for speaker verification
Mickael Rouvier
Pierre-Michel Bousquet
J. Duret
52
6
0
10 May 2021
Voice activity detection in the wild: A data-driven approach using teacher-student training
Heinrich Dinkel
Shuai Wang
Xuenan Xu
Mengyue Wu
K. Yu
VLM
40
33
0
10 May 2021
SpeechNet: A Universal Modularized Model for Speech Processing Tasks
Yi-Chen Chen
Po-Han Chi
Shu-Wen Yang
Kai-Wei Chang
Jheng-hao Lin
Sung-Feng Huang
Da-Rong Liu
Chi-Liang Liu
Cheng-Kuang Lee
Hung-yi Lee
MoE
64
17
0
07 May 2021
A Good Image Generator Is What You Need for High-Resolution Video Synthesis
Yu Tian
Jian Ren
Menglei Chai
Kyle Olszewski
Xi Peng
Dimitris N. Metaxas
Sergey Tulyakov
VGen
106
188
0
30 Apr 2021
Personalized Keyphrase Detection using Speaker and Environment Information
R. Rikhye
Quan Wang
Qiao Liang
Yanzhang He
Ding Zhao
Yiteng Huang
A. Narayanan
Ian McGraw
52
11
0
28 Apr 2021
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
137
41
0
26 Apr 2021
Motion Representations for Articulated Animation
Aliaksandr Siarohin
Oliver J. Woodford
Jian Ren
Menglei Chai
Sergey Tulyakov
OCL
210
277
0
22 Apr 2021
Voice2Mesh: Cross-Modal 3D Face Model Generation from Voices
Cho-Ying Wu
Ke Xu
Chin-Cheng Hsu
Ulrich Neumann
CVBM
3DH
77
4
0
21 Apr 2021
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection
Junke Wang
Zuxuan Wu
Wenhao Ouyang
Xintong Han
Jingjing Chen
Ser-Nam Lim
Yu-Gang Jiang
ViT
181
277
0
20 Apr 2021
Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization
Prachi Singh
Sriram Ganapathy
SSL
57
9
0
19 Apr 2021
Federated Learning of User Verification Models Without Sharing Embeddings
H. Hosseini
Hyunsin Park
Sungrack Yun
Christos Louizos
Joseph B. Soriaga
Max Welling
FedML
46
24
0
18 Apr 2021
Conditional independence for pretext task selection in Self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
SSL
41
4
0
15 Apr 2021
Speaker Attentive Speech Emotion Recognition
Clément Le Moine
Nicolas Obin
Axel Roebel
52
13
0
15 Apr 2021
Learning Metrics from Mean Teacher: A Supervised Learning Method for Improving the Generalization of Speaker Verification System
Ju-ho Kim
Hye-jin Shim
Jee-weon Jung
Ha-Jin Yu
114
1
0
14 Apr 2021
Exploring Machine Speech Chain for Domain Adaptation and Few-Shot Speaker Adaptation
Fengpeng Yue
Yan Deng
Lei He
Tom Ko
70
8
0
08 Apr 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
Yihui Fu
Luyao Cheng
Shubo Lv
Yukai Jv
Yuxiang Kong
...
Jian Wu
Hui Bu
Xin Xu
Jun Du
Jingdong Chen
112
98
0
08 Apr 2021
Single Source One Shot Reenactment using Weighted motion From Paired Feature Points
S. Tripathy
Arno Solin
Esa Rahtu
3DH
DiffM
29
8
0
07 Apr 2021
Adapting Speaker Embeddings for Speaker Diarisation
Youngki Kwon
Jee-weon Jung
Hee-Soo Heo
You Jin Kim
Bong-Jin Lee
Joon Son Chung
49
13
0
07 Apr 2021
Speaker embeddings by modeling channel-wise correlations
Themos Stafylakis
Johan Rohdin
L. Burget
81
9
0
06 Apr 2021
Speaker Diarization using Two-pass Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings
Kiran Karra
A. McCree
30
2
0
06 Apr 2021
Binary Neural Network for Speaker Verification
Tinglong Zhu
Xiaoyi Qin
Ming Li
MQ
51
12
0
06 Apr 2021
End-to-End Speaker-Attributed ASR with Transformer
Naoyuki Kanda
Guoli Ye
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Zhuo Chen
Takuya Yoshioka
75
49
0
05 Apr 2021
Streaming Multi-talker Speech Recognition with Joint Speaker Identification
Liang Lu
Naoyuki Kanda
Jinyu Li
Jiawei Liu
75
20
0
05 Apr 2021
Attention Back-end for Automatic Speaker Verification with Multiple Enrollment Utterances
Chang Zeng
Xin Wang
Erica Cooper
Xiaoxiao Miao
Junichi Yamagishi
93
21
0
04 Apr 2021
Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio
Jeffrey Tumminia
Amanda Kuznecov
Sophia Tsilerides
Ilana Weinstein
Brian McFee
M. Picheny
A. Kaufman
39
1
0
03 Apr 2021