arXiv:1806.05622
VoxCeleb2: Deep Speaker Recognition
14 June 2018
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
Papers citing "VoxCeleb2: Deep Speaker Recognition"
50 / 755 papers shown
Title
USTC-KXDIGIT System Description for ASVspoof5 Challenge
Y. Chen
Haochen Wu
Nan Jiang
Xiang Xia
Qing Gu
...
Sian Fang
Yan Song
Wu Guo
Lin Liu
Minqiang Xu
41
1
0
03 Sep 2024
Interpretable Convolutional SyncNet
Sungjoon Park
Jaesub Yun
Donggeon Lee
Minsik Park
52
0
0
02 Sep 2024
Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification
Aref Farhadipour
Masoumeh Chapariniya
Teodora Vukovic
Volker Dellwo
34
2
0
31 Aug 2024
MegActor-Σ: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer
Shurong Yang
Huadong Li
Juhao Wu
Minhao Jing
Linze Li
Renhe Ji
Jiajun Liang
Haoqiang Fan
Jin Wang
VGen
DiffM
40
9
0
27 Aug 2024
The VoxCeleb Speaker Recognition Challenge: A Retrospective
Jaesung Huh
Joon Son Chung
Arsha Nagrani
A. Brown
Jee-weon Jung
Daniel Garcia-Romero
Andrew Zisserman
38
3
0
27 Aug 2024
Sample-Independent Federated Learning Backdoor Attack in Speaker Recognition
Weida Xu
Yang Xu
Sicong Zhang
FedML
AAML
41
0
0
25 Aug 2024
G3FA: Geometry-guided GAN for Face Animation
Alireza Javanmardi
A. Pagani
Didier Stricker
CVBM
3DH
34
2
0
23 Aug 2024
BUT Systems and Analyses for the ASVspoof 5 Challenge
Johan Rohdin
Lin Zhang
Oldřich Plchot
Vojtěch Staněk
David Mihola
...
Themos Stafylakis
Dmitriy Beveraki
Anna Silnova
Jan Brukner
Lukáš Burget
41
1
0
20 Aug 2024
Disentangling segmental and prosodic factors to non-native speech comprehensibility
Waris Quamer
Ricardo Gutierrez-Osuna
32
1
0
20 Aug 2024
Supervised and Unsupervised Alignments for Spoofing Behavioral Biometrics
Thomas Thebaud
Gaël Le Lan
Anthony Larcher
AAML
32
0
0
14 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
42
1
0
12 Aug 2024
Speech privacy-preserving methods using secret key for convolutional neural network models and their robustness evaluation
Shoko Niwa
Sayaka Shiota
Hitoshi Kiya
18
0
0
07 Aug 2024
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
Jiazhi Guan
Zhiliang Xu
Hang Zhou
Kaisiyuan Wang
Shengyi He
...
Errui Ding
Jingtuo Liu
Jingdong Wang
Youjian Zhao
Ziwei Liu
VGen
51
2
0
06 Aug 2024
Automatic Voice Identification after Speech Resynthesis using PPG
Thibault Gaudier
Marie Tahon
Anthony Larcher
Yannick Esteve
40
0
0
05 Aug 2024
Contrastive Learning-based Chaining-Cluster for Multilingual Voice-Face Association
Wuyang Chen
Yanjie Sun
Kele Xu
Yong Dou
CVBM
31
0
0
04 Aug 2024
Contextual Cross-Modal Attention for Audio-Visual Deepfake Detection and Localization
Vinaya Sree Katamneni
A. Rattani
35
4
0
02 Aug 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues
Tianrui Pan
Jie Liu
Bohan Wang
Jie Tang
Gangshan Wu
40
2
0
27 Jul 2024
UniForensics: Face Forgery Detection via General Facial Representation
Ziyuan Fang
Hanqing Zhao
Tianyi Wei
Wenbo Zhou
Ming Wan
Zhanyi Wang
Weiming Zhang
Neng H. Yu
CVBM
38
1
0
26 Jul 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
39
4
0
21 Jul 2024
Anchored Diffusion for Video Face Reenactment
I. Kligvasser
Regev Cohen
G. Leifman
Ehud Rivlin
Michael Elad
DiffM
VGen
34
1
0
21 Jul 2024
AU-vMAE: Knowledge-Guide Action Units Detection via Video Masked Autoencoder
Qiaoqiao Jin
Rui Shi
Yishun Dou
Bingbing Ni
CVBM
51
0
0
16 Jul 2024
Learning Natural Consistency Representation for Face Forgery Video Detection
Daichi Zhang
Zihao Xiao
Shikun Li
Fanzhao Lin
Jianmin Li
Shiming Ge
CVBM
34
10
0
15 Jul 2024
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
Li Lyna Zhang
Ning Jiang
Qing Wang
Yuehong Li
Quan Lu
Lei Xie
36
6
0
14 Jul 2024
Phonetic Richness for Improved Automatic Speaker Verification
Nicholas Klein
Ganesh Sivaraman
Elie Khoury
24
0
0
10 Jul 2024
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices
Jianwen Jiang
Gaojie Lin
Zhengkun Rong
Chao Liang
Yongming Zhu
Jiaqi Yang
Tianyun Zhong
3DH
90
8
0
08 Jul 2024
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
44
1
0
08 Jul 2024
We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings
Ismail Rasim Ulgen
Carlos Busso
John H. L. Hansen
Berrak Sisman
29
1
0
05 Jul 2024
Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
Sungnyun Kim
Kangwook Jang
Sangmin Bae
Hoirin Kim
Se-Young Yun
44
3
0
04 Jul 2024
GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification
Hui Yan
Zhenchun Lei
Changhong Liu
Yong Zhou
21
2
0
03 Jul 2024
Probing the Feasibility of Multilingual Speaker Anonymization
Sarina Meyer
Florian Lux
Ngoc Thang Vu
44
3
0
03 Jul 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Hiroshi Sato
Takafumi Moriya
Masato Mimura
Shota Horiguchi
Tsubasa Ochiai
Takanori Ashihara
Atsushi Ando
Kentaro Shinayama
Marc Delcroix
35
1
0
01 Jul 2024
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios
Juan Ignacio Alvarez-Trejos
Beltrán Labrador
Alicia Lozano-Diez
35
1
0
01 Jul 2024
Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert
Han EunGi
Oh Hyun-Bin
Kim Sung-Bin
Corentin Nivelet Etcheberry
Suekyeong Nam
Janghoon Joo
Tae-Hyun Oh
23
5
0
01 Jul 2024
An Attribute Interpolation Method in Speech Synthesis by Model Merging
Masato Murata
Koichi Miyazaki
Tomoki Koriyama
MoMe
37
4
0
30 Jun 2024
Application of ASV for Voice Identification after VC and Duration Predictor Improvement in TTS Models
Borodin Kirill Nikolayevich
Kudryavtsev Vasiliy Dmitrievich
Mkrtchian Grach Maratovich
Gorodnichev Mikhail Genadievich
Korzh Dmitrii Sergeevich
33
0
0
27 Jun 2024
Fairness and Bias in Multimodal AI: A Survey
Tosin P. Adewumi
Lama Alkhaled
Namrata Gurung
G. V. Boven
Irene Pagliai
58
9
0
27 Jun 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
Adriana Fernandez-Lopez
Honglie Chen
Pingchuan Ma
Lu Yin
Q. Xiao
Stavros Petridis
Shiwei Liu
Maja Pantic
46
2
0
25 Jun 2024
Disentangled Representation Learning for Environment-agnostic Speaker Recognition
Kihyun Nam
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
50
0
0
20 Jun 2024
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
Kim Sung-Bin
Lee Chae-Yeon
Gihun Son
Oh Hyun-Bin
Janghoon Ju
Suekyeong Nam
Tae-Hyun Oh
36
11
0
20 Jun 2024
CEC: A Noisy Label Detection Method for Speaker Recognition
Yao Shen
Yingying Gao
Yaqian Hao
Chenguang Hu
Fulin Zhang
Junlan Feng
Shilei Zhang
NoLa
34
0
0
19 Jun 2024
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
Young Jin Ahn
Jungwoo Park
Sangha Park
Jonghyun Choi
Kee-Eung Kim
34
7
0
18 Jun 2024
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling
Vahid Ahmadi Kalkhorani
Cheng Yu
Anurag Kumar
Ke Tan
Buye Xu
DeLiang Wang
32
0
0
17 Jun 2024
Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision
Yafeng Chen
Siqi Zheng
Hui Wang
Luyao Cheng
Qian Chen
Shiliang Zhang
Wen Wang
SSL
29
2
0
17 Jun 2024
A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing
Ming Meng
Yufei Zhao
Bo Zhang
Yonggui Zhu
Weimin Shi
Maxwell Wen
Zhaoxin Fan
VGen
42
1
0
15 Jun 2024
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge
Chen Chen
Zehua Liu
Xiaolou Li
Lantian Li
D. Wang
35
2
0
14 Jun 2024
Personalized Speech Enhancement Without a Separate Speaker Embedding Model
Tanel Pärnamaa
Ando Saabas
36
1
0
14 Jun 2024
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech
Martina Valente
Fabio Brugnara
Giovanni Morrone
Enrico Zovato
Leonardo Badino
35
0
0
13 Jun 2024
FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
Chaeyoung Jung
Suyeon Lee
Ji-Hoon Kim
Joon Son Chung
DiffM
47
4
0
13 Jun 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
31
0
0
13 Jun 2024
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
Rui Wang
Liping Chen
Kong Aik Lee
Zhen-Hua Ling
23
2
0
12 Jun 2024