VoxCeleb: a large-scale speaker identification dataset


26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman

Papers citing "VoxCeleb: a large-scale speaker identification dataset"

50 / 1,098 papers shown
Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification
Haibin Wu
Heng-Cheng Kuo
Yu Tsao
Hung-yi Lee
AAML
32
1
0
14 Dec 2023
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
31
9
0
14 Dec 2023
GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance
Haiming Zhang
Zhihao Yuan
Chaoda Zheng
Xu Yan
Baoyuan Wang
Guanbin Li
Song Wu
Shuguang Cui
Zhen Li
CVBM
55
1
0
12 Dec 2023
EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings
Sung Hwan Mun
Mingrui Han
Canyeong Moon
Nam Soo Kim
42
1
0
11 Dec 2023
Speaker-Text Retrieval via Contrastive Learning
Xuechen Liu
Xin Wang
Erica Cooper
Xiaoxiao Miao
Junichi Yamagishi
VLM
22
0
0
11 Dec 2023
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
Binzhu Sha
Xu Li
Zhiyong Wu
Yin Shan
Helen M. Meng
23
7
0
08 Dec 2023
SingingHead: A Large-scale 4D Dataset for Singing Head Animation
Sijing Wu
Yunhao Li
Weitian Zhang
Jun Jia
Yucheng Zhu
Yichao Yan
Guangtao Zhai
Xiaokang Yang
49
2
0
07 Dec 2023
Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization
Huan Zhao
Li Zhang
Yuehong Li
Yannan Wang
Hongji Wang
Wei Rao
Qing Wang
Lei Xie
10
0
0
07 Dec 2023
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification
Tianchi Liu
Kong Aik Lee
Qiongqiong Wang
Haizhou Li
VLM
78
13
0
06 Dec 2023
VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior
Xusen Sun
Longhao Zhang
Hao Zhu
Peng Zhang
Bang Zhang
Xinya Ji
Kangneng Zhou
Daiheng Gao
Liefeng Bo
Xun Cao
VGen
33
24
0
04 Dec 2023
DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization
Zhaoyang Xia
C. Neidle
Dimitris N. Metaxas
DiffM
46
3
0
27 Nov 2023
Phonetic-aware speaker embedding for far-field speaker verification
Zezhong Jin
Youzhi Tu
Man-Wai Mak
25
1
0
27 Nov 2023
Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel
Shreyas Ramoji
Somil Jain
Pratik Roy Chowdhuri
Prachi Singh
Deepu Vijayasenan
Sriram Ganapathy
30
6
0
21 Nov 2023
Video Face Re-Aging: Toward Temporally Consistent Face Re-Aging
Abdul Muqeet
Kyuchul Lee
Bumsoo Kim
Yohan Hong
Hyungrae Lee
Woonggon Kim
Kwang Hee Lee
35
3
0
20 Nov 2023
SponTTS: modeling and transferring spontaneous style for TTS
Hanzhao Li
Xinfa Zhu
Liumeng Xue
Yang Song
Yunlin Chen
Lei Xie
43
7
0
13 Nov 2023
CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer
Haoyu Ma
Tong Zhang
Shanlin Sun
Xiangyi Yan
Kun Han
Xiaohui Xie
34
5
0
11 Nov 2023
On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks
Daniel Claborne
Eric Slyman
Karl Pazdernik
20
0
0
09 Nov 2023
CapST: An Enhanced and Lightweight Model Attribution Approach for Synthetic Videos
Wasim Ahmad
Yan-Tsung Peng
Yuan-Hao Chang
Gaddisa Olani Ganfure
Sarwar Khan
Sahibzada Adil Shahzad
23
0
0
07 Nov 2023
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection
Sahibzada Adil Shahzad
Ammarah Hashmi
Yan-Tsung Peng
Yu Tsao
Hsin-Min Wang
34
5
0
05 Nov 2023
Generative Face Video Coding Techniques and Standardization Efforts: A Review
Bo Chen
Jie Chen
Shiqi Wang
Yan Ye
33
12
0
05 Nov 2023
3D-Aware Talking-Head Video Motion Transfer
Haomiao Ni
Jiachen Liu
Yuan Xue
S. X. Huang
DiffM
61
3
0
05 Nov 2023
LaughTalk: Expressive 3D Talking Head Generation with Laughter
Kim Sung-Bin
Lee Hyun
Da Hye Hong
Suekyeong Nam
Janghoon Ju
Tae-Hyun Oh
28
22
0
02 Nov 2023
Deep Neural Networks for Automatic Speaker Recognition Do Not Learn Supra-Segmental Temporal Features
Daniel Neururer
Volker Dellwo
Thilo Stadelmann
41
2
0
01 Nov 2023
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Kevin Qinghong Lin
Faisal Ahmed
Linjie Li
Chung-Ching Lin
E. Azarnasab
...
Lin Liang
Zicheng Liu
Yumao Lu
Ce Liu
Lijuan Wang
MLLM
28
63
0
30 Oct 2023
Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer
Jianwei Zhang
Suren Jayasuriya
Visar Berisha
SSL
32
2
0
25 Oct 2023
HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis
Nafis Irtiza Tripto
Adaku Uchendu
Thai V. Le
Mattia Setzu
F. Giannotti
Dongwon Lee
DeLMO
31
6
0
25 Oct 2023
Modality-Agnostic Self-Supervised Learning with Meta-Learned Masked Auto-Encoder
Huiwon Jang
Jihoon Tack
Daewon Choi
Jongheon Jeong
Jinwoo Shin
21
2
0
25 Oct 2023
Diffusion-Based Adversarial Purification for Speaker Verification
Yibo Bai
Ju Liu
Xuelong Li
DiffM
38
2
0
22 Oct 2023
Learning Motion Refinement for Unsupervised Face Animation
Jiale Tao
Shuhang Gu
Wen Li
Lixin Duan
CVBM
3DH
26
4
0
21 Oct 2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System
T. Park
He Huang
Ante Jukić
Kunal Dhawan
Krishna C. Puvvada
Nithin Rao Koluguri
Nikolay Karpov
A. Laptev
Jagadeesh Balam
Boris Ginsburg
35
6
0
18 Oct 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation
T. Park
He Huang
Coleman Hooper
Nithin Rao Koluguri
Kunal Dhawan
Ante Jukić
Jagadeesh Balam
Boris Ginsburg
21
7
0
18 Oct 2023
DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification
Yuanyuan Wang
Yang Zhang
Zhiyong Wu
Zhihan Yang
Tao Wei
Kun Zou
Helen M. Meng
25
1
0
18 Oct 2023
LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism
Yu Chen
Xinyuan Qian
Zexu Pan
Kainan Chen
Haizhou Li
14
2
0
16 Oct 2023
Expression Domain Translation Network for Cross-domain Head Reenactment
Taewoong Kang
Jeongsik Oh
Jaeseong Lee
S. Park
Jaegul Choo
31
0
0
16 Oct 2023
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
39
5
0
12 Oct 2023
Cost-Driven Hardware-Software Co-Optimization of Machine Learning Pipelines
Ravit Sharma
W. Romaszkan
Feiqian Zhu
Puneet Gupta
Ankur Mehta
27
0
0
11 Oct 2023
An Initial Investigation of Neural Replay Simulator for Over-the-Air Adversarial Perturbations to Automatic Speaker Verification
Jiaqi Li
Li Wang
Liumeng Xue
Lei Wang
Zhizheng Wu
AAML
40
3
0
09 Oct 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR
Yangze Li
Fan Yu
Yuhao Liang
Pengcheng Guo
Mohan Shi
Zhihao Du
Shiliang Zhang
Lei Xie
24
3
0
07 Oct 2023
VoiceExtender: Short-utterance Text-independent Speaker Verification with Guided Diffusion Model
Yayun He
Zuheng Kang
Jianzong Wang
Junqing Peng
Jing Xiao
DiffM
27
2
0
07 Oct 2023
Realistic Speech-to-Face Generation with Speech-Conditioned Latent Diffusion Model with Face Prior
Jinting Wang
Li Liu
Jun Wang
Hei Victor Cheng
DiffM
20
2
0
05 Oct 2023
A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization
Youwang Kim
Lee Hyun
Kim Sung-Bin
Suekyeong Nam
Janghoon Ju
Tae-Hyun Oh
CVBM
3DH
31
3
0
04 Oct 2023
Disentangling Voice and Content with Self-Supervision for Speaker Recognition
Tianchi Liu
Kong Aik Lee
Qiongqiong Wang
Haizhou Li
BDL
DRL
35
31
0
02 Oct 2023
Audio-Visual Speaker Verification via Joint Cross-Attention
R Gnana Praveen
Jahangir Alam
34
6
0
28 Sep 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
32
1
0
26 Sep 2023
Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification
Yuke Lin
Xiaoyi Qin
Ning Jiang
Guoqing Zhao
Ming Li
42
3
0
25 Sep 2023
VoiceLDM: Text-to-Speech with Environmental Context
Yeong-Won Lee
In-won Yeon
Juhan Nam
Joon Son Chung
VLM
DiffM
27
10
0
24 Sep 2023
Semantic Face Compression for Metaverse: A Compact 3D Descriptor Based Approach
Binzhe Li
Bo Chen
Zhao Wang
Shiqi Wang
Yan Ye
3DH
36
2
0
24 Sep 2023
Efficient Black-Box Speaker Verification Model Adaptation with Reprogramming and Backend Learning
Jingyu Li
Tan Lee
AAML
27
1
0
24 Sep 2023
Profile-Error-Tolerant Target-Speaker Voice Activity Detection
Dongmei Wang
Xiong Xiao
Naoyuki Kanda
Midia Yousefi
Takuya Yoshioka
Jian Wu
21
3
0
21 Sep 2023
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition
Shuai Wang
Qibing Bai
Qi Liu
Jianwei Yu
Zhengyang Chen
Bing Han
Yan-min Qian
Haizhou Li
27
1
0
21 Sep 2023