1806.05622
VoxCeleb2: Deep Speaker Recognition
14 June 2018
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
Papers citing
"VoxCeleb2: Deep Speaker Recognition"
50 / 750 papers shown
Title
Zero-Shot Fake Video Detection by Audio-Visual Consistency
Xiaolou Li
Zehua Liu
Chen Chen
Lantian Li
Li Guo
D. Wang
55
4
0
12 Jun 2024
Target Speaker Extraction with Curriculum Learning
Yun Liu
Xuechen Liu
Xiaoxiao Miao
Junichi Yamagishi
21
3
0
12 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
52
0
0
11 Jun 2024
Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels
Shlomo Salo Elia
Aviad Malachi
V. Aharonson
Gadi Pinkas
29
0
0
09 Jun 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang
Ziyang Ma
Fan Yu
Zhifu Gao
Shiliang Zhang
Xie Chen
AuLLM
38
2
0
09 Jun 2024
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization
Bei Liu
Haoyu Wang
Yanmin Qian
MQ
33
1
0
08 Jun 2024
To what extent can ASV systems naturally defend against spoofing attacks?
Jee-weon Jung
Xin Eric Wang
Nicholas W. D. Evans
Shinji Watanabe
Hye-jin Shim
Hemlata Tak
Siddhant Arora
Junichi Yamagishi
Joon Son Chung
AAML
38
3
0
08 Jun 2024
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen
Jiawei Du
Haibin Wu
Jyh-Shing Roger Jang
Hung-yi Lee
34
2
0
07 Jun 2024
LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
Sreyan Ghosh
Sonal Kumar
Ashish Seth
Purva Chiniya
Utkarsh Tyagi
R. Duraiswami
Dinesh Manocha
41
0
0
06 Jun 2024
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation
D. Doukhan
Christine Maertens
William Le Personnic
Ludovic Speroni
Reda Dehak
35
2
0
06 Jun 2024
Hypernetworks for Personalizing ASR to Atypical Speech
Max Müller-Eberstein
Dianna Yee
Karren D. Yang
G. Mantena
Colin S. Lea
33
0
0
06 Jun 2024
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
Trevine Oorloff
Surya Koppisetti
Nicolò Bonettini
Divyaraj Solanki
Ben Colman
Yaser Yacoob
Ali Shahriyari
Gaurav Bharaj
37
21
0
05 Jun 2024
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
Victor Miara
Theo Lepage
Reda Dehak
29
1
0
04 Jun 2024
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Davide Berghi
Philip J. B. Jackson
42
0
0
01 Jun 2024
ComFace: Facial Representation Learning with Synthetic Data for Comparing Faces
Yusuke Akamatsu
Terumi Umematsu
Hitoshi Imaoka
Shizuko Gomi
Hideo Tsurushima
94
0
0
25 May 2024
HiddenSpeaker: Generate Imperceptible Unlearnable Audios for Speaker Verification System
Zhisheng Zhang
Pengyang Huang
AAML
29
3
0
24 May 2024
Non-autoregressive real-time Accent Conversion model with voice cloning
Vladimir Nechaev
Sergey Kosyakov
39
1
0
21 May 2024
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
Yue Han
Junwei Zhu
Keke He
Xu Chen
Yanhao Ge
Wei Li
Xiangtai Li
Jiangning Zhang
Chengjie Wang
Yong Liu
DiffM
50
25
0
21 May 2024
Neighborhood Attention Transformer with Progressive Channel Fusion for Speaker Verification
Nian Li
Jianguo Wei
ViT
32
0
0
20 May 2024
Multi-speaker Text-to-speech Training with Speaker Anonymized Data
Wen-Chin Huang
Yi-Chiao Wu
T. Toda
40
1
0
20 May 2024
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
Youngjoon Jang
Ji-Hoon Kim
Junseok Ahn
Doyeop Kwak
Hong-Sun Yang
Yooncheol Ju
Il-Hwan Kim
Byeong-Yeol Kim
Joon Son Chung
CVBM
31
9
0
16 May 2024
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
Jenthe Thienpondt
Kris Demuynck
41
2
0
15 May 2024
PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset
Yang Hou
Haitao Fu
Chuankai Chen
Zida Li
Haoyu Zhang
Jianjun Zhao
29
3
0
14 May 2024
Speaker Characterization by means of Attention Pooling
Federico Costa
Miquel India
Javier Hernando
25
1
0
07 May 2024
Exploring Self-Supervised Vision Transformers for Deepfake Detection: A Comparative Analysis
H. Nguyen
Junichi Yamagishi
Isao Echizen
39
6
0
01 May 2024
Towards Real-world Video Face Restoration: A New Benchmark
Ziyan Chen
Jingwen He
Xinqi Lin
Yu Qiao
Chao Dong
44
4
0
30 Apr 2024
EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars
Nikita Drobyshev
Antoni Bigata Casademunt
Konstantinos Vougioukas
Zoe Landgraf
Stavros Petridis
Maja Pantic
44
23
0
29 Apr 2024
Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh
Elvir Karimov
Mikhail Aleksandrovich Pautov
Oleg Y. Rogov
Ivan V. Oseledets
50
1
0
29 Apr 2024
Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification
Artem Abzaliev
Humberto Pérez Espinosa
Rada Mihalcea
VLM
25
1
0
29 Apr 2024
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
34
1
0
29 Apr 2024
A Comparison of Differential Performance Metrics for the Evaluation of Automatic Speaker Verification Fairness
Oubaïda Chouchane
Christoph Busch
Chiara Galdi
Nicholas W. D. Evans
Massimiliano Todisco
37
1
0
27 Apr 2024
A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification
Rémi Uro
D. Doukhan
Albert Rilliard
Laëtitia Larcher
Anissa-Claire Adgharouamane
Marie Tahon
Antoine Laurent
47
4
0
26 Apr 2024
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition
Zheng Lian
Haiyang Sun
Guoying Zhao
Zhuofan Wen
Siyuan Zhang
...
Bin Liu
Erik Cambria
Guoying Zhao
Björn W. Schuller
Jianhua Tao
VLM
31
11
0
26 Apr 2024
Voice Passing : a Non-Binary Voice Gender Prediction System for evaluating Transgender voice transition
D. Doukhan
Simon Devauchelle
Lucile Girard-Monneron
Mía Chávez Ruz
V. Chaddouk
Isabelle Wagner
Albert Rilliard
19
1
0
23 Apr 2024
GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting
Hongyun Yu
Zhan Qu
Qihang Yu
Jianchuan Chen
Zhonghua Jiang
...
Shengyu Zhang
Jimin Xu
Fei Wu
Chengfei Lv
Gang Yu
3DGS
35
12
0
22 Apr 2024
Audio Anti-Spoofing Detection: A Survey
Menglu Li
Yasaman Ahmadiadli
Xiao-Ping Zhang
46
17
0
22 Apr 2024
Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction
Zhaoxi Mu
Xinyu Yang
34
5
0
19 Apr 2024
Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression Recognition
Marah Halawa
Florian Blume
Pia Bideau
Martin Maier
Rasha Abdel Rahman
Olaf Hellwich
CVBM
36
1
0
16 Apr 2024
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Sicheng Xu
Guojun Chen
Yu-Xiao Guo
Jiaolong Yang
Chong Li
Zhenyu Zang
Yizhong Zhang
Xin Tong
Baining Guo
45
87
0
16 Apr 2024
3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow
Felix Taubner
Prashant Raina
Mathieu Tuli
Eu Wern Teh
Chul Lee
Jinmiao Huang
3DH
CVBM
46
4
0
15 Apr 2024
FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features
Andre Rochow
Max Schwarz
Sven Behnke
ViT
48
6
0
15 Apr 2024
Fuse after Align: Improving Face-Voice Association Learning via Multimodal Encoder
Chong Peng
Liqiang He
Dan Su
CVBM
31
0
0
15 Apr 2024
The VoicePrivacy 2024 Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Pierre Champion
Sarina Meyer
Xin Wang
Emmanuel Vincent
Michele Panariello
Nicholas W. D. Evans
Junichi Yamagishi
Massimiliano Todisco
36
21
0
03 Apr 2024
BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition
A. Haliassos
Andreas Zinonos
Rodrigo Mira
Stavros Petridis
Maja Pantic
VLM
SSL
AI4TS
39
12
0
02 Apr 2024
Zero-Shot Multi-Lingual Speaker Verification in Clinical Trials
Ali Akram
Marija Stanojevic
Malikeh Ehghaghi
Jekaterina Novikova
22
0
0
02 Apr 2024
EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
Shuai Tan
Bin Ji
Mengxiao Bi
Ye Pan
38
26
0
02 Apr 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
Asymmetric and trial-dependent modeling: the contribution of LIA to SdSV Challenge Task 2
Pierre-Michel Bousquet
Mickael Rouvier
12
0
0
28 Mar 2024
MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation
Seyeon Kim
Siyoon Jin
Jihye Park
Kihong Kim
Jiyoung Kim
Jisu Nam
Seungryong Kim
DiffM
VGen
60
3
0
28 Mar 2024
Deepfake Generation and Detection: A Benchmark and Survey
Gan Pei
Jiangning Zhang
Menghan Hu
Zhenyu Zhang
Chengjie Wang
Yunsheng Wu
Guangtao Zhai
Jian Yang
Chunhua Shen
Dacheng Tao
52
25
0
26 Mar 2024