1706.08612
Cited By
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing "VoxCeleb: a large-scale speaker identification dataset"
50 / 1,098 papers shown
Title
A Benchmark for Multi-speaker Anonymization
Xiaoxiao Miao
Ruijie Tao
Chang Zeng
Xin Wang
49
1
0
08 Jul 2024
GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification
Hui Yan
Zhenchun Lei
Changhong Liu
Yong Zhou
34
2
0
03 Jul 2024
Probing the Feasibility of Multilingual Speaker Anonymization
Sarina Meyer
Florian Lux
Ngoc Thang Vu
50
3
0
03 Jul 2024
LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control
Jianzhu Guo
Dingyun Zhang
Xiaoqiang Liu
Zhizhou Zhong
Yuan Zhang
Pengfei Wan
Di Zhang
VGen
65
54
0
03 Jul 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Hiroshi Sato
Takafumi Moriya
Masato Mimura
Shota Horiguchi
Tsubasa Ochiai
Takanori Ashihara
Atsushi Ando
Kentaro Shinayama
Marc Delcroix
43
1
0
01 Jul 2024
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios
Juan Ignacio Alvarez-Trejos
Beltrán Labrador
Alicia Lozano-Diez
38
1
0
01 Jul 2024
SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures
Oguzhan Baser
Kaan Kale
Sandeep Chinchali
34
0
0
01 Jul 2024
An Attribute Interpolation Method in Speech Synthesis by Model Merging
Masato Murata
Koichi Miyazaki
Tomoki Koriyama
MoMe
45
5
0
30 Jun 2024
Application of ASV for Voice Identification after VC and Duration Predictor Improvement in TTS Models
Borodin Kirill Nikolayevich
Kudryavtsev Vasiliy Dmitrievich
Mkrtchian Grach Maratovich
Gorodnichev Mikhail Genadievich
Korzh Dmitrii Sergeevich
44
0
0
27 Jun 2024
WavRx: a Disease-Agnostic, Generalizable, and Privacy-Preserving Speech Health Diagnostic Model
Yi Zhu
Tiago H. Falk
MedIm
41
0
0
26 Jun 2024
RealTalk: Real-time and Realistic Audio-driven Face Generation with 3D Facial Prior-guided Identity Alignment Network
Xiaozhong Ji
Chuming Lin
Zhonggan Ding
Ying Tai
Junwei Zhu
Xiaobin Hu
Donghao Luo
Yanhao Ge
Chengjie Wang
CVBM
32
2
0
26 Jun 2024
AudioBench: A Universal Benchmark for Audio Large Language Models
Bin Wang
Xunlong Zou
Geyu Lin
Shri Kiran Srinivasan
Zhuohan Liu
Wenyu Zhang
Zhengyuan Liu
AiTi Aw
Nancy F. Chen
AuLLM
ELM
LM&MA
92
23
0
23 Jun 2024
GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech
Wenbin Wang
Yang Song
Sanjay Jha
44
6
0
21 Jun 2024
DASB -- Discrete Audio and Speech Benchmark
Pooneh Mousavi
Luca Della Libera
J. Duret
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
35
13
0
20 Jun 2024
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
Kim Sung-Bin
Lee Chae-Yeon
Gihun Son
Oh Hyun-Bin
Janghoon Ju
Suekyeong Nam
Tae-Hyun Oh
36
12
0
20 Jun 2024
CEC: A Noisy Label Detection Method for Speaker Recognition
Yao Shen
Yingying Gao
Yaqian Hao
Chenguang Hu
Fulin Zhang
Junlan Feng
Shilei Zhang
NoLa
34
0
0
19 Jun 2024
Articulatory Encodec: Coding Speech through Vocal Tract Kinematics
Cheol Jun Cho
Peter Wu
Tejas S. Prabhune
Dhruv Agarwal
Gopala K. Anumanchipalli
36
2
0
18 Jun 2024
Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision
Yafeng Chen
Siqi Zheng
Hui Wang
Luyao Cheng
Qian Chen
Shiliang Zhang
Wen Wang
SSL
29
2
0
17 Jun 2024
How Should We Extract Discrete Audio Tokens from Self-Supervised Models?
Pooneh Mousavi
J. Duret
Salah Zaiem
Luca Della Libera
Artem Ploujnikov
Cem Subakan
Mirco Ravanelli
42
10
0
15 Jun 2024
CNVSRC 2023: The First Chinese Continuous Visual Speech Recognition Challenge
Chen Chen
Zehua Liu
Xiaolou Li
Lantian Li
D. Wang
35
2
0
14 Jun 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
39
0
0
13 Jun 2024
Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions
Anfeng Xu
Kevin Huang
Tiantian Feng
Lue Shen
Helen Tager-Flusberg
Shrikanth Narayanan
35
2
0
12 Jun 2024
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
Zhenyu Zhou
Shibiao Xu
Shi Yin
Lantian Li
D. Wang
42
0
0
11 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
52
0
0
11 Jun 2024
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
50
3
0
11 Jun 2024
Source-Free Domain Adaptation for Speaker Verification in Data-Scarce Languages and Noisy Channels
Shlomo Salo Elia
Aviad Malachi
V. Aharonson
Gadi Pinkas
29
0
0
09 Jun 2024
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization
Bei Liu
Haoyu Wang
Yanmin Qian
MQ
40
1
0
08 Jun 2024
To what extent can ASV systems naturally defend against spoofing attacks?
Jee-weon Jung
Xin Eric Wang
Nicholas W. D. Evans
Shinji Watanabe
Hye-jin Shim
Hemlata Tak
Siddhant Arora
Junichi Yamagishi
Joon Son Chung
AAML
46
3
0
08 Jun 2024
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen
Jiawei Du
Haibin Wu
Jyh-Shing Roger Jang
Hung-yi Lee
40
2
0
07 Jun 2024
InaGVAD: a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation
D. Doukhan
Christine Maertens
William Le Personnic
Ludovic Speroni
Reda Dehak
38
2
0
06 Jun 2024
Hypernetworks for Personalizing ASR to Atypical Speech
Max Müller-Eberstein
Dianna Yee
Karren D. Yang
G. Mantena
Colin S. Lea
33
1
0
06 Jun 2024
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
Victor Miara
Theo Lepage
Reda Dehak
37
1
0
04 Jun 2024
M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
Masahiro Yasuda
Shunsuke Tsubaki
Keisuke Imoto
VLM
38
5
0
04 Jun 2024
ComFace: Facial Representation Learning with Synthetic Data for Comparing Faces
Yusuke Akamatsu
Terumi Umematsu
Hitoshi Imaoka
Shizuko Gomi
Hideo Tsurushima
97
0
0
25 May 2024
Non-autoregressive real-time Accent Conversion model with voice cloning
Vladimir Nechaev
Sergey Kosyakov
42
1
0
21 May 2024
Neighborhood Attention Transformer with Progressive Channel Fusion for Speaker Verification
Nian Li
Jianguo Wei
ViT
32
0
0
20 May 2024
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Siavash Shams
Sukru Samet Dindar
Xilin Jiang
N. Mesgarani
Mamba
74
18
0
20 May 2024
Generative Artificial Intelligence: A Systematic Review and Applications
S. S. Sengar
Affan Bin Hasan
Sanjay Kumar
Fiona Carroll
MedIm
38
52
0
17 May 2024
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization
Jenthe Thienpondt
Kris Demuynck
41
2
0
15 May 2024
Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning
Alain Riou
Stefan Lattner
Gaëtan Hadjeres
Geoffroy Peeters
43
2
0
14 May 2024
Open Implementation and Study of BEST-RQ for Speech Processing
Ryan Whetten
Titouan Parcollet
Marco Dinarelli
Yannick Esteve
27
6
0
07 May 2024
Speaker Characterization by means of Attention Pooling
Federico Costa
Miquel India
Javier Hernando
33
1
0
07 May 2024
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding
Tao Liu
Feilong Chen
Shuai Fan
Chenpeng Du
Qi Chen
Xie Chen
Kai Yu
DiffM
PINN
36
25
0
06 May 2024
In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol
Wei-Han Wang
Chin-Yuan Yeh
Hsi-Wen Chen
De-Nian Yang
Ming-Syan Chen
56
0
0
01 May 2024
Certification of Speaker Recognition Models to Additive Perturbations
Dmitrii Korzh
Elvir Karimov
Mikhail Aleksandrovich Pautov
Oleg Y. Rogov
Ivan Oseledets
50
1
0
29 Apr 2024
Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification
Artem Abzaliev
Humberto Pérez Espinosa
Rada Mihalcea
VLM
28
1
0
29 Apr 2024
A Comparison of Differential Performance Metrics for the Evaluation of Automatic Speaker Verification Fairness
Oubaïda Chouchane
Christoph Busch
Chiara Galdi
Nicholas W. D. Evans
Massimiliano Todisco
42
2
0
27 Apr 2024
A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification
Rémi Uro
D. Doukhan
Albert Rilliard
Laëtitia Larcher
Anissa-Claire Adgharouamane
Marie Tahon
Antoine Laurent
47
4
0
26 Apr 2024
3DFlowRenderer: One-shot Face Re-enactment via Dense 3D Facial Flow Estimation
Siddharth Nijhawan
T. Yashima
Tamaki Kojima
3DH
40
0
0
23 Apr 2024
AudioRepInceptionNeXt: A lightweight single-stream architecture for efficient audio recognition
Kin Wai Lau
Yasar Abbas Ur Rehman
L. Po
44
1
0
21 Apr 2024