1806.05622
VoxCeleb2: Deep Speaker Recognition
14 June 2018
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
Papers citing "VoxCeleb2: Deep Speaker Recognition"
50 / 773 papers shown
Title
A Multi-tasking Model of Speaker-Keyword Classification for Keeping Human in the Loop of Drone-assisted Inspection
Yu Li
Anisha Parsan
Bill Wang
Penghao Dong
Shanshan Yao
Ruwen Qin
29
5
0
08 Jul 2022
Generating gender-ambiguous voices for privacy-preserving speech recognition
Dimitrios Stoidis
Andrea Cavallaro
36
14
0
03 Jul 2022
Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion
Ahmad Aloradi
Wolfgang Mack
Mohamed Elminshawi
Emanuel Habets
35
5
0
28 Jun 2022
Domain Agnostic Few-shot Learning for Speaker Verification
Seunghan Yang
Debasmit Das
Jang Hyun Cho
Hyoungwoo Park
Sungrack Yun
OOD
19
7
0
28 Jun 2022
Learning from human perception to improve automatic speaker verification in style-mismatched conditions
Amber Afshan
Abeer Alwan
VLM
23
1
0
28 Jun 2022
Rethinking Audio-visual Synchronization for Active Speaker Detection
Abudukelimu Wuerkaixi
You Zhang
Z. Duan
Changshui Zhang
18
10
0
21 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
36
131
0
18 Jun 2022
Realistic One-shot Mesh-based Head Avatars
Taras Khakhulin
V. Sklyarova
Victor Lempitsky
Egor Zakharov
3DH
34
97
0
16 Jun 2022
AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
35
18
0
07 Jun 2022
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Nan Zhang
Jianzong Wang
Zhenhou Hong
Chendong Zhao
Xiaoyang Qu
Jing Xiao
37
5
0
26 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
32
8
0
19 May 2022
The VoicePrivacy 2020 Challenge Evaluation Plan
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
32
43
0
14 May 2022
Collar-aware Training for Streaming Speaker Change Detection in Broadcast Speech
Joonas Kalda
Tanel Alumäe
13
3
0
14 May 2022
F3A-GAN: Facial Flow for Face Animation with Generative Adversarial Networks
Xintian Wu
Qihang Zhang
Yiming Wu
Huanyu Wang
Songyuan Li
Lingyun Sun
Xi Li
CVBM
3DH
39
7
0
12 May 2022
Efficient dynamic filter for robust and low computational feature extraction
Donghyeon Kim
Gwantae Kim
Bokyeung Lee
Jeong-gi Kwak
D. Han
Hanseok Ko
28
3
0
03 May 2022
Emotion-Controllable Generalized Talking Face Generation
Sanjana Sinha
S. Biswas
Ravindra Yadav
Brojeshwar Bhowmick
CVBM
18
49
0
02 May 2022
Baselines and Protocols for Household Speaker Recognition
A. Sholokhov
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
25
4
0
30 Apr 2022
Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen
David Fouhey
Andrew Owens
SSL
27
19
0
26 Apr 2022
Back-ends Selection for Deep Speaker Embeddings
Zhuo Li
Runqiu Xiao
Zi-qiang Zhang
Zhenduo Zhao
Wenchao Wang
Pengyuan Zhang
19
0
0
25 Apr 2022
EMOCA: Emotion Driven Monocular Face Capture and Animation
Radek Daněček
Michael J. Black
Timo Bolkart
CVBM
3DH
34
200
0
24 Apr 2022
Towards Metrical Reconstruction of Human Faces
Wojciech Zielonka
Timo Bolkart
Justus Thies
CVBM
3DH
36
144
0
13 Apr 2022
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization
Zhixi Cai
Kalin Stefanov
Abhinav Dhall
Munawar Hayat
20
3
0
13 Apr 2022
Audio-Visual Person-of-Interest DeepFake Detection
D. Cozzolino
Alessandro Pianese
Matthias Nießner
L. Verdoliva
36
60
0
06 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
23
16
0
04 Apr 2022
Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification
Sung Hwan Mun
Jee-weon Jung
Min Hyun Han
N. Kim
50
21
0
03 Apr 2022
Residual-guided Personalized Speech Synthesis based on Face Image
Jianrong Wang
Zixuan Wang
Xiaosheng Hu
Xuewei Li
Qiang Fang
Li Liu
CVBM
24
16
0
01 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
16
32
0
31 Mar 2022
Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey
Sparsh Sinha
Susmita Ghose
19
3
0
31 Mar 2022
A Comparative Study of Fusion Methods for SASV Challenge 2022
Petr Grinberg
V. Shikhov
AAML
26
2
0
31 Mar 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
19
0
0
30 Mar 2022
Spoofing-Aware Speaker Verification by Multi-Level Fusion
Haibin Wu
Lingwei Meng
Jiawen Kang
Jinchao Li
Xu Li
Xixin Wu
Hung-yi Lee
Helen Meng
22
8
0
29 Mar 2022
Decomposed Temporal Dynamic CNN: Efficient Time-Adaptive Network for Text-Independent Speaker Verification Explained with Speaker Activation Map
Seong-Hu Kim
Hyeonuk Nam
Yong-Hwa Park
22
9
0
29 Mar 2022
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification
Yang Zhang
Zhiqiang Lv
Haibin Wu
Shanshan Zhang
Pengfei Hu
Zhiyong Wu
Hung-yi Lee
Helen Meng
ViT
24
130
0
29 Mar 2022
Robust Speaker Recognition with Transformers Using wav2vec 2.0
Sergey Novoselov
G. Lavrentyeva
Anastasia Avdeeva
V. Volokhov
Aleksei Gusev
ViT
15
18
0
28 Mar 2022
End-to-End Active Speaker Detection
Juan Carlos León Alcázar
M. Cordes
Chen Zhao
Guohao Li
24
27
0
27 Mar 2022
WaveFuzz: A Clean-Label Poisoning Attack to Protect Your Voice
Yunjie Ge
Qianqian Wang
Jingfeng Zhang
Juntao Zhou
Yunzhu Zhang
Chao Shen
AAML
22
6
0
25 Mar 2022
3D GAN Inversion for Controllable Portrait Image Animation
Connor Z. Lin
David B. Lindell
E. R. Chan
Gordon Wetzstein
3DH
21
61
0
25 Mar 2022
Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model
Tarun Gupta
Duc-Tuan Truong
Tran The Anh
Chng Eng Siong
24
14
0
22 Mar 2022
Automated detection of foreground speech with wearable sensing in everyday home environments: A transfer learning approach
Dawei Liang
Zifan Xu
Yinuo Chen
Rebecca Adaimi
David Harwath
Edison Thomaz
40
1
0
21 Mar 2022
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
46
6
0
21 Mar 2022
TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding
Ruiteng Zhang
Jianguo Wei
Xugang Lu
Wenhuan Lu
Di Jin
Junhai Xu
Lin Zhang
Y. Ji
J. Dang
20
4
0
17 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
31
87
0
16 Mar 2022
SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
30
11
0
12 Mar 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
23
19
0
08 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
20
3
0
05 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Magnitude-aware Probabilistic Speaker Embeddings
Nikita Kuzmin
Igor Fedorov
A. Sholokhov
27
7
0
28 Feb 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
64
25
0
26 Feb 2022
Contrastive-mixup learning for improved speaker verification
Xin Zhang
Minho Jin
R. Cheng
Ruirui Li
Eunjung Han
A. Stolcke
AAML
SSL
25
10
0
22 Feb 2022
Learnable Nonlinear Compression for Robust Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
27
2
0
10 Feb 2022