1706.08612
Cited By
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,100 papers shown
Title
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
26
37
0
08 Dec 2021
Robust Speech Representation Learning via Flow-based Embedding Regularization
Woohyun Kang
Jahangir Alam
A. Fathan
32
3
0
07 Dec 2021
One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning
Suzhe Wang
Lincheng Li
Yueqing Ding
Xin Yu
CVBM
76
117
0
06 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
185
383
0
04 Dec 2021
AVA-AVD: Audio-Visual Speaker Diarization in the Wild
Eric Z. Xu
Zeyang Song
Satoshi Tsutsui
C. Feng
Mang Ye
Mike Zheng Shou
VGen
21
42
0
29 Nov 2021
Low-Latency Online Speaker Diarization with Graph-Based Label Generation
Yucong Zhang
Qinjian Lin
Weiqing Wang
Lin Yang
Xuyang Wang
Junjie Wang
Ming Li
24
10
0
27 Nov 2021
An MAP Estimation for Between-Class Variance
Jiao Han
Yunqi Cai
Lantian Li
Guanyu Li
Dong Wang
13
0
0
24 Nov 2021
A Study on Decoupled Probabilistic Linear Discriminant Analysis
Ding Wang
Lantian Li
Hongzhi Yu
Dong Wang
11
0
0
24 Nov 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
32
68
0
23 Nov 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
45
74
0
19 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
34
665
0
17 Nov 2021
Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses
V. Trinh
Sebastian Braun
33
17
0
16 Nov 2021
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment
Kangyeol Kim
S. Park
Jaeseong Lee
Sunghyo Chung
Junsoo Lee
Jaegul Choo
3DH
CVBM
27
13
0
15 Nov 2021
MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification
Ladislav Mošner
Oldrich Plchot
L. Burget
J. Černocký
39
7
0
11 Nov 2021
Inclusive Speaker Verification with Adaptive thresholding
Navdeep Jain
Hongcheng Wang
30
0
0
10 Nov 2021
SAFA: Structure Aware Face Animation
Qiulin Wang
Luankun Zhang
Bo Li
3DH
CVBM
58
21
0
09 Nov 2021
Characterizing the adversarial vulnerability of speech self-supervised learning
Haibin Wu
Bo Zheng
Xu Li
Xixin Wu
Hung-yi Lee
Helen Meng
AAML
SSL
133
7
0
08 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
22
56
0
07 Nov 2021
Emotional Prosody Control for Speech Generation
S. Sivaprasad
Saiteja Kosgi
Vineet Gandhi
12
17
0
07 Nov 2021
Class Token and Knowledge Distillation for Multi-head Self-Attention Speaker Verification Systems
Victoria Mingote
A. Miguel
A. O. Giménez
Eduardo Lleida Solano
39
10
0
06 Nov 2021
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification
J. Málek
Jakub Janský
Zbyněk Koldovský
Tomás Kounovský
Jaroslav Cmejla
J. Zdánský
25
10
0
05 Nov 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
26
149
0
04 Nov 2021
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
Peter Wu
Jiatong Shi
Yifan Zhong
Shinji Watanabe
A. Black
27
8
0
02 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
138
1,721
0
26 Oct 2021
CS-Rep: Making Speaker Verification Networks Embracing Re-parameterization
Ruiteng Zhang
Jianguo Wei
Wenhuan Lu
Lin Zhang
Y. Ji
Junhai Xu
Xugang Lu
25
10
0
26 Oct 2021
Contrastive Neural Processes for Self-Supervised Learning
Konstantinos Kallidromitis
Denis A. Gudovskiy
Kozuka Kazuki
Ohama Iku
Luca Rigazio
SSL
AI4TS
46
10
0
24 Oct 2021
A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data
Madina Abdrakhmanova
Siwen Guo
Yerbolat Khassanov
Shohreh Haddadan
19
5
0
23 Oct 2021
Optimizing Multi-Taper Features for Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
31
1
0
21 Oct 2021
Rep Works in Speaker Verification
Yufeng Ma
Miao Zhao
Yiwei Ding
Yu Zheng
Min Liu
Minqiang Xu
32
8
0
19 Oct 2021
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Baolin Zheng
Peipei Jiang
Qian Wang
Qi Li
Chao Shen
Cong Wang
Yunjie Ge
Qingyang Teng
Shenyi Zhang
AAML
18
69
0
19 Oct 2021
Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
27
8
0
18 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
36
12
0
17 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
118
194
0
14 Oct 2021
The Impact of Spatiotemporal Augmentations on Self-Supervised Audiovisual Representation Learning
Haider Al-Tahan
Y. Mohsenzadeh
SSL
AI4TS
39
0
0
13 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
281
1,026
0
13 Oct 2021
Duality Temporal-channel-frequency Attention Enhanced Speaker Representation Learning
Li Zhang
Qing Wang
Lei Xie
44
17
0
13 Oct 2021
Simple Attention Module based Speaker Verification with Iterative noisy label detection
Xiaoyi Qin
Na Li
Chao Weng
Dan Su
Ming Li
NoLa
65
50
0
13 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning
Paarth Neekhara
Jason Chun Lok Li
Boris Ginsburg
38
15
0
12 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
26
124
0
12 Oct 2021
Multi-View Self-Attention Based Transformer for Speaker Recognition
Rui Wang
Junyi Ao
Long Zhou
Shujie Liu
Zhihua Wei
Tom Ko
Qing Li
Yu Zhang
ViT
14
31
0
11 Oct 2021
Self-Supervised 3D Face Reconstruction via Conditional Estimation
Yandong Wen
Weiyang Liu
Bhiksha Raj
Rita Singh
CVBM
36
21
0
10 Oct 2021
Fine-grained Identity Preserving Landmark Synthesis for Face Reenactment
Haichao Zhang
Youcheng Ben
Weixiao Zhang
Tao Chen
Gang Yu
Bin-Bin Fu
CVBM
21
2
0
10 Oct 2021
Poformer: A simple pooling transformer for speaker verification
Yufeng Ma
Yiwei Ding
Miao Zhao
Yu Zheng
Min Liu
Minqiang Xu
ViT
21
2
0
10 Oct 2021
Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation
Peirong Liu
Rui Wang
Xuefei Cao
Yipin Zhou
Ashish Shah
Ser-Nam Lim
DiffM
39
3
0
09 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
Joel Shor
A. Jansen
Wei Han
Daniel S. Park
Yu Zhang
SSL
AI4TS
45
54
0
09 Oct 2021
Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification
Qingjian Lin
Lin Yang
Xuyang Wang
Xiaoyi Qin
Junjie Wang
Ming Li
30
21
0
09 Oct 2021
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
Ge Zhu
Frank Cwitkowitz
Z. Duan
22
2
0
08 Oct 2021
Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity
You Jin Kim
Hee-Soo Heo
Jee-weon Jung
Youngki Kwon
Bong-Jin Lee
Joon Son Chung
32
3
0
07 Oct 2021
Multi-scale speaker embedding-based graph attention networks for speaker diarisation
Youngki Kwon
Hee-Soo Heo
Jee-weon Jung
You Jin Kim
Bong-Jin Lee
Joon Son Chung
46
18
0
07 Oct 2021
Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study
Dawei Liang
Yangyang Shi
Yun Wang
Nayan Singhal
Alex Xiao
Jonathan Shaw
Edison Thomaz
Ozlem Kalinli
M. Seltzer
20
4
0
07 Oct 2021