VoxCeleb2: Deep Speaker Recognition


14 June 2018
Joon Son Chung
Arsha Nagrani
Andrew Zisserman

Papers citing "VoxCeleb2: Deep Speaker Recognition"

50 / 773 papers shown
A Multi-tasking Model of Speaker-Keyword Classification for Keeping Human in the Loop of Drone-assisted Inspection
Yu Li
Anisha Parsan
Bill Wang
Penghao Dong
Shanshan Yao
Ruwen Qin
29
5
0
08 Jul 2022
Generating gender-ambiguous voices for privacy-preserving speech recognition
Dimitrios Stoidis
Andrea Cavallaro
36
14
0
03 Jul 2022
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion
Ahmad Aloradi
Wolfgang Mack
Mohamed Elminshawi
Emanuel Habets
35
5
0
28 Jun 2022
Domain Agnostic Few-shot Learning for Speaker Verification
Seunghan Yang
Debasmit Das
Jang Hyun Cho
Hyoungwoo Park
Sungrack Yun
OOD
19
7
0
28 Jun 2022
Learning from human perception to improve automatic speaker verification in style-mismatched conditions
Amber Afshan
Abeer Alwan
VLM
23
1
0
28 Jun 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
18
10
0
21 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
36
131
0
18 Jun 2022
Realistic One-shot Mesh-based Head Avatars
Taras Khakhulin
V. Sklyarova
Victor Lempitsky
Egor Zakharov
3DH
34
97
0
16 Jun 2022
AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
35
18
0
07 Jun 2022
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Nan Zhang
Jianzong Wang
Zhenhou Hong
Chendong Zhao
Xiaoyang Qu
Jing Xiao
37
5
0
26 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
32
8
0
19 May 2022
The VoicePrivacy 2020 Challenge Evaluation Plan
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
32
43
0
14 May 2022
Collar-aware Training for Streaming Speaker Change Detection in Broadcast Speech
Joonas Kalda
Tanel Alumäe
13
3
0
14 May 2022
F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks
Xintian Wu
Qihang Zhang
Yiming Wu
Huanyu Wang
Songyuan Li
Lingyun Sun
Xi Li
CVBM
3DH
39
7
0
12 May 2022
Efficient dynamic filter for robust and low computational feature extraction
Donghyeon Kim
Gwantae Kim
Bokyeung Lee
Jeong-gi Kwak
D. Han
Hanseok Ko
28
3
0
03 May 2022
Emotion-Controllable Generalized Talking Face Generation
Sanjana Sinha
S. Biswas
Ravindra Yadav
Brojeshwar Bhowmick
CVBM
18
49
0
02 May 2022
Baselines and Protocols for Household Speaker Recognition
A. Sholokhov
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
25
4
0
30 Apr 2022
Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen
David Fouhey
Andrew Owens
SSL
27
19
0
26 Apr 2022
Back-ends Selection for Deep Speaker Embeddings
Zhuo Li
Runqiu Xiao
Zi-qiang Zhang
Zhenduo Zhao
Wenchao Wang
Pengyuan Zhang
19
0
0
25 Apr 2022
EMOCA: Emotion Driven Monocular Face Capture and Animation
Radek Daněček
Michael J. Black
Timo Bolkart
CVBM
3DH
34
200
0
24 Apr 2022
Towards Metrical Reconstruction of Human Faces
Wojciech Zielonka
Timo Bolkart
Justus Thies
CVBM
3DH
36
144
0
13 Apr 2022
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization
Zhixi Cai
Kalin Stefanov
Abhinav Dhall
Munawar Hayat
20
3
0
13 Apr 2022
Audio-Visual Person-of-Interest DeepFake Detection
D. Cozzolino
Alessandro Pianese
Matthias Nießner
L. Verdoliva
36
60
0
06 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
23
16
0
04 Apr 2022
Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification
Sung Hwan Mun
Jee-weon Jung
Min Hyun Han
N. Kim
50
21
0
03 Apr 2022
Residual-guided Personalized Speech Synthesis based on Face Image
Jianrong Wang
Zixuan Wang
Xiaosheng Hu
Xuewei Li
Qiang Fang
Li Liu
CVBM
24
16
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
16
32
0
31 Mar 2022
Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey
Sparsh Sinha
Susmita Ghose
19
3
0
31 Mar 2022
A Comparative Study of Fusion Methods for SASV Challenge 2022
Petr Grinberg
V. Shikhov
AAML
26
2
0
31 Mar 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
19
0
0
30 Mar 2022
Spoofing-Aware Speaker Verification by Multi-Level Fusion
Haibin Wu
Lingwei Meng
Jiawen Kang
Jinchao Li
Xu Li
Xixin Wu
Hung-yi Lee
Helen Meng
22
8
0
29 Mar 2022
Decomposed Temporal Dynamic CNN: Efficient Time-Adaptive Network for Text-Independent Speaker Verification Explained with Speaker Activation Map
Seong-Hu Kim
Hyeonuk Nam
Yong-Hwa Park
22
9
0
29 Mar 2022
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification
Yang Zhang
Zhiqiang Lv
Haibin Wu
Shanshan Zhang
Pengfei Hu
Zhiyong Wu
Hung-yi Lee
Helen Meng
ViT
24
130
0
29 Mar 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
15
18
0
28 Mar 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
24
27
0
27 Mar 2022
WaveFuzz: A Clean-Label Poisoning Attack to Protect Your Voice
Yunjie Ge
Qianqian Wang
Jingfeng Zhang
Juntao Zhou
Yunzhu Zhang
Chao Shen
AAML
22
6
0
25 Mar 2022
3D GAN Inversion for Controllable Portrait Image Animation
Connor Z. Lin
David B. Lindell
E. R. Chan
Gordon Wetzstein
3DH
21
61
0
25 Mar 2022
Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model
Tarun Gupta
Duc-Tuan Truong
Tran The Anh
Chng Eng Siong
24
14
0
22 Mar 2022
Automated detection of foreground speech with wearable sensing in everyday home environments: A transfer learning approach
Dawei Liang
Zifan Xu
Yinuo Chen
Rebecca Adaimi
David Harwath
Edison Thomaz
40
1
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
46
6
0
21 Mar 2022
TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding
Ruiteng Zhang
Jianguo Wei
Xugang Lu
Wenhuan Lu
Di Jin
Junhai Xu
Lin Zhang
Y. Ji
J. Dang
20
4
0
17 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
31
87
0
16 Mar 2022
SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
30
11
0
12 Mar 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
23
19
0
08 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
20
3
0
05 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
27
7
0
28 Feb 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
64
25
0
26 Feb 2022
Contrastive-mixup learning for improved speaker verification
Xin Zhang
Minho Jin
R. Cheng
Ruirui Li
Eunjung Han
A. Stolcke
AAML
SSL
25
10
0
22 Feb 2022
Learnable Nonlinear Compression for Robust Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
27
2
0
10 Feb 2022