VoxCeleb: a large-scale speaker identification dataset

26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,111 papers shown
An Initial Investigation of Neural Replay Simulator for Over-the-Air Adversarial Perturbations to Automatic Speaker Verification
Jiaqi Li
Li Wang
Liumeng Xue
Lei Wang
Zhizheng Wu
AAML
78
3
0
09 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
44
4
0
07 Oct 2023
VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model
Yayun He
Zuheng Kang
Jianzong Wang
Junqing Peng
Jing Xiao
DiffM
55
2
0
07 Oct 2023
Realistic Speech-to-Face Generation with Speech-Conditioned Latent Diffusion Model with Face Prior
Jinting Wang
Li Liu
Jun Wang
Hei Victor Cheng
DiffM
57
2
0
05 Oct 2023
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
Youwang Kim
Lee Hyun
Kim Sung-Bin
Suekyeong Nam
Janghoon Ju
Tae-Hyun Oh
CVBM, 3DH
63
3
0
04 Oct 2023
Disentangling Voice and Content with Self-Supervision for Speaker Recognition
Tianchi Liu
Kong Aik Lee
Qiongqiong Wang
Haizhou Li
BDL, DRL
98
32
0
02 Oct 2023
Audio-Visual Speaker Verification via Joint Cross-Attention
R Gnana Praveen
Jahangir Alam
83
6
0
28 Sep 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
92
2
0
26 Sep 2023
Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification
Yuke Lin
Xiaoyi Qin
Ning Jiang
Guoqing Zhao
Ming Li
78
3
0
25 Sep 2023
VoiceLDM: Text-to-Speech with Environmental Context
Yeong-Won Lee
In-won Yeon
Juhan Nam
Joon Son Chung
VLM, DiffM
75
15
0
24 Sep 2023
Semantic Face Compression for Metaverse: A Compact 3D Descriptor Based Approach
Binzhe Li
Bo Chen
Zhao Wang
Shiqi Wang
Yan Ye
3DH
64
2
0
24 Sep 2023
Efficient Black-Box Speaker Verification Model Adaptation with Reprogramming and Backend Learning
Jingyu Li
Tan Lee
AAML
68
1
0
24 Sep 2023
Profile-Error-Tolerant Target-Speaker Voice Activity Detection
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Midia Yousefi
Takuya Yoshioka
Jian Wu
68
4
0
21 Sep 2023
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition
Shuai Wang
Qibing Bai
Qi Liu
Jianwei Yu
Zhengyang Chen
Bing Han
Yan-min Qian
Haizhou Li
64
1
0
21 Sep 2023
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Yuan Tseng
Layne Berry
Yi-Ting Chen
I-Hsiang Chiu
Hsuan-Hao Lin
...
Yu Tsao
Shinji Watanabe
Abdel-rahman Mohamed
Chi-Luen Feng
Hung-yi Lee
VLM, SSL
125
15
0
19 Sep 2023
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks
Zeyang Song
Jibin Wu
Malu Zhang
Mike Zheng Shou
Haizhou Li
92
4
0
18 Sep 2023
A Generative Framework for Self-Supervised Facial Representation Learning
Ruian He
Zhen Xing
Weimin Tan
Bo Yan
DiffM
63
0
0
15 Sep 2023
DiariST: Streaming Speech Translation with Speaker Diarization
Muqiao Yang
Naoyuki Kanda
Xiaofei Wang
Junkun Chen
Peidong Wang
Jian Xue
Jinyu Li
Takuya Yoshioka
79
7
0
14 Sep 2023
SLMIA-SR: Speaker-Level Membership Inference Attacks against Speaker Recognition Systems
Guangke Chen
Yedi Zhang
Fu Song
80
8
0
14 Sep 2023
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models
Ju-ho Kim
Ju-Sung Heo
Hyun-Seo Shin
Chanmann Lim
Ha-Jin Yu
DiffM
35
3
0
14 Sep 2023
Codec Data Augmentation for Time-domain Heart Sound Classification
Ansh Mishra
J. Yip
Chng Eng Siong
39
1
0
14 Sep 2023
Getting More for Less: Using Weak Labels and AV-Mixup for Robust Audio-Visual Speaker Verification
Anith Selvakumar
H. Fashandi
VLM
65
0
0
13 Sep 2023
SynVox2: Towards a privacy-friendly VoxCeleb2 dataset
Xiaoxiao Miao
Xin Eric Wang
Erica Cooper
Junichi Yamagishi
Nicholas W. D. Evans
Massimiliano Todisco
J. Bonastre
Mickael Rouvier
62
5
0
12 Sep 2023
MaskRenderer: 3D-Infused Multi-Mask Realistic Face Reenactment
Tina Behrouzi
Atefeh Shahroudnejad
Payam Mousavi
CVBM
46
1
0
10 Sep 2023
Voice Morphing: Two Identities in One Voice
Sushant Pani
Anurag Chowdhury
Morgan Sandler
Arun Ross
77
1
0
05 Sep 2023
Acoustic-to-articulatory inversion for dysarthric speech: Are pre-trained self-supervised representations favorable?
Sarthak Kumar Maharana
Krishna Kamal Adidam
Shoumik Nandi
Ajitesh Srivastava
91
2
0
03 Sep 2023
From Pixels to Portraits: A Comprehensive Survey of Talking Head Generation Techniques and Applications
Shreyank N. Gowda
Dheeraj Pandey
Shashank Narayana Gowda
86
4
0
30 Aug 2023
AGS: An Dataset and Taxonomy for Domestic Scene Sound Event Recognition
Nan Che
Chenrui Liu
Fei Yu
62
0
0
30 Aug 2023
The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge
Ruoyu Wang
Maokui He
Jun Du
Hengshun Zhou
Shutong Niu
...
Mengzhi Wang
Genshun Wan
Jia Pan
Jianqing Gao
Chin-Hui Lee
59
12
0
28 Aug 2023
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads
Salah Zaiem
Youcef Kemiche
Titouan Parcollet
S. Essid
Mirco Ravanelli
SSL
48
10
0
28 Aug 2023
Unified and Dynamic Graph for Temporal Character Grouping in Long Videos
Xiujun Shu
Wei Wen
Liangsheng Xu
Ruizhi Qiao
Taian Guo
Hanjun Li
Bei Gan
Tianlin Li
Xing Sun
129
0
0
27 Aug 2023
Fairness and Privacy in Voice Biometrics: A Study of Gender Influences Using wav2vec 2.0
Oubaïda Chouchane
Michele Panariello
Chiara Galdi
Massimiliano Todisco
Nicholas W. D. Evans
49
2
0
27 Aug 2023
ToonTalker: Cross-Domain Face Reenactment
Yuan Gong
Yong Zhang
Xiaodong Cun
Fei Yin
Yanbo Fan
Xuanxia Wang
Baoyuan Wu
Yujiu Yang
CVBM
77
8
0
24 Aug 2023
DF-3DFace: One-to-Many Speech Synchronized 3D Face Animation with Diffusion
Se Jin Park
Joanna Hong
Minsu Kim
Y. Ro
97
4
0
23 Aug 2023
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification
Harunori Kawano
Sota Shimizu
36
1
0
22 Aug 2023
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
84
5
0
17 Aug 2023
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
J. Choi
Joanna Hong
Y. Ro
DiffM
81
22
0
15 Aug 2023
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations
Wen Wu
Chuxu Zhang
P. Woodland
64
4
0
14 Aug 2023
VoxBlink: A Large Scale Speaker Verification Dataset on Camera
Yuke Lin
Xiaoyi Qin
Guoqing Zhao
Ming Cheng
Ning Jiang
Haiying Wu
Ming Li
131
18
0
14 Aug 2023
Versatile Face Animator: Driving Arbitrary 3D Facial Avatar in RGBD Space
Haoyu Wang
Haozhe Wu
Junliang Xing
Jia Jia
3DH
39
4
0
11 Aug 2023
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
Zirui Ge
Xinzhou Xu
Haiyan Guo
Tingting Wang
Zhen Yang
SSL
67
2
0
09 Aug 2023
Breaking Speaker Recognition with PaddingBack
Zhe Ye
Diqun Yan
Li Dong
Kailai Shen
AAML
75
3
0
08 Aug 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen, DiffM
95
1
0
31 Jul 2023
On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer
Md. Asif Jalal
Pablo Peso Parada
Jisi Zhang
Karthikeyan P. Saravanan
Mete Ozay
Myoungji Han
Jung In Lee
Seokyeong Jung
58
1
0
25 Jul 2023
HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces
Stella Bounareli
Christos Tzelepis
Vasileios Argyriou
Ioannis Patras
Georgios Tzimiropoulos
CVBM
94
36
0
20 Jul 2023
PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification
Wonbin Kim
Hyun-Seo Shin
Ju-ho Kim
Ju-Sung Heo
Chanmann Lim
Ha-Jin Yu
100
0
0
20 Jul 2023
Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation
Fa-Ting Hong
Dan Xu
CVBM
75
34
0
19 Jul 2023
Exploring Binary Classification Loss For Speaker Verification
Bing Han
Zhengyang Chen
Y. Qian
CVBM
71
12
0
17 Jul 2023
Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications
Varun Krishna
T. Sai
Sriram Ganapathy
SSL
57
2
0
14 Jul 2023
Retrieving Continuous Time Event Sequences using Neural Temporal Point Processes with Learnable Hashing
Vinayak Gupta
Srikanta J. Bedathur
A. De
AI4TS
75
1
0
13 Jul 2023