1706.08612
VoxCeleb: a large-scale speaker identification dataset
26 June 2017
Arsha Nagrani
Joon Son Chung
Andrew Zisserman
Papers citing
"VoxCeleb: a large-scale speaker identification dataset"
50 / 1,111 papers shown
Title
Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training
J. Yang
Lei He
90
11
0
20 Jan 2022
VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge
A. Brown
Jaesung Huh
Joon Son Chung
Arsha Nagrani
Daniel Garcia-Romero
Andrew Zisserman
76
40
0
12 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
115
215
0
07 Jan 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
198
51
0
27 Dec 2021
Graph attentive feature aggregation for text-independent speaker verification
Hye-jin Shim
Ju-Sung Heo
Jae-han Park
Gareth Lee
Ha-Jin Yu
105
16
0
23 Dec 2021
Fusion and Orthogonal Projection for Improved Face-Voice Association
Muhammad Saeed
M. H. Khan
Shah Nawaz
Muhammad Haroon Yousaf
Alessio Del Bue
CVBM
132
28
0
20 Dec 2021
Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification
Sung Hwan Mun
Min Hyun Han
Dongjune Lee
Jihwan Kim
N. Kim
SSL
96
3
0
16 Dec 2021
End-to-end speaker diarization with transformer
Yongquan Lai
Xin Tang
Yuanyuan Fu
Rui Fang
51
1
0
14 Dec 2021
Explore Long-Range Context feature for Speaker Verification
Zhuo Li
69
6
0
14 Dec 2021
Smooth-Swap: A Simple Enhancement for Face-Swapping with Smoothness
Jiseob Kim
Ji-Hyun Lee
Byoung-Tak Zhang
CVBM
66
45
0
11 Dec 2021
X-Vector based voice activity detection for multi-genre broadcast speech-to-text
Misa Ogura
Matt Haynes
43
0
0
09 Dec 2021
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization
Mufan Sang
Haoqi Li
Fan Liu
Andrew O. Arnold
Li Wan
SSL
94
41
0
08 Dec 2021
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
121
40
0
08 Dec 2021
Robust Speech Representation Learning via Flow-based Embedding Regularization
Woohyun Kang
Jahangir Alam
A. Fathan
62
3
0
07 Dec 2021
One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning
Suzhe Wang
Lincheng Li
Yueqing Ding
Xin Yu
CVBM
128
119
0
06 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
246
415
0
04 Dec 2021
AVA-AVD: Audio-Visual Speaker Diarization in the Wild
Eric Z. Xu
Zeyang Song
Satoshi Tsutsui
C. Feng
Mang Ye
Mike Zheng Shou
VGen
83
43
0
29 Nov 2021
Low-Latency Online Speaker Diarization with Graph-Based Label Generation
Yucong Zhang
Qinjian Lin
Weiqing Wang
Lin Yang
Xuyang Wang
Junjie Wang
Ming Li
62
10
0
27 Nov 2021
An MAP Estimation for Between-Class Variance
Jiao Han
Yunqi Cai
Lantian Li
Guanyu Li
Dong Wang
28
0
0
24 Nov 2021
A Study on Decoupled Probabilistic Linear Discriminant Analysis
Ding Wang
Lantian Li
Hongzhi Yu
Dong Wang
38
0
0
24 Nov 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
128
71
0
23 Nov 2021
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech
Suwon Shon
Ankita Pasad
Felix Wu
Pablo Brusco
Yoav Artzi
Karen Livescu
Kyu Jeong Han
AuLLM
ELM
106
76
0
19 Nov 2021
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
Arun Babu
Changhan Wang
Andros Tjandra
Kushal Lakhotia
Qiantong Xu
...
Yatharth Saraf
J. Pino
Alexei Baevski
Alexis Conneau
Michael Auli
SSL
114
713
0
17 Nov 2021
Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses
V. Trinh
Sebastian Braun
62
19
0
16 Nov 2021
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment
Kangyeol Kim
S. Park
Jaeseong Lee
Sunghyo Chung
Junsoo Lee
Jaegul Choo
3DH
CVBM
105
15
0
15 Nov 2021
MultiSV: Dataset for Far-Field Multi-Channel Speaker Verification
Ladislav Mošner
Oldrich Plchot
L. Burget
J. Černocký
63
7
0
11 Nov 2021
Inclusive Speaker Verification with Adaptive thresholding
Navdeep Jain
Hongcheng Wang
50
0
0
10 Nov 2021
SAFA: Structure Aware Face Animation
Qiulin Wang
Luankun Zhang
Bo Li
3DH
CVBM
90
21
0
09 Nov 2021
Characterizing the adversarial vulnerability of speech self-supervised learning
Haibin Wu
Bo Zheng
Xu Li
Xixin Wu
Hung-yi Lee
Helen Meng
AAML
SSL
173
7
0
08 Nov 2021
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech
Sung-Feng Huang
Chyi-Jiunn Lin
Da-Rong Liu
Yi-Chen Chen
Hung-yi Lee
126
57
0
07 Nov 2021
Emotional Prosody Control for Speech Generation
S. Sivaprasad
Saiteja Kosgi
Vineet Gandhi
63
17
0
07 Nov 2021
Class Token and Knowledge Distillation for Multi-head Self-Attention Speaker Verification Systems
Victoria Mingote
A. Miguel
A. O. Giménez
Eduardo Lleida Solano
62
10
0
06 Nov 2021
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification
J. Málek
Jakub Janský
Zbyněk Koldovský
Tomás Kounovský
Jaroslav Cmejla
J. Zdánský
50
10
0
05 Nov 2021
A Fine-tuned Wav2vec 2.0/HuBERT Benchmark For Speech Emotion Recognition, Speaker Verification and Spoken Language Understanding
Yingzhi Wang
Abdelmoumene Boumadane
A. Heba
100
153
0
04 Nov 2021
Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity
Peter Wu
Jiatong Shi
Yifan Zhong
Shinji Watanabe
A. Black
66
8
0
02 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
296
1,913
0
26 Oct 2021
CS-Rep: Making Speaker Verification Networks Embracing Re-parameterization
Ruiteng Zhang
Jianguo Wei
Wenhuan Lu
Lin Zhang
Y. Ji
Junhai Xu
Xugang Lu
59
10
0
26 Oct 2021
Contrastive Neural Processes for Self-Supervised Learning
Konstantinos Kallidromitis
Denis A. Gudovskiy
Kozuka Kazuki
Ohama Iku
Luca Rigazio
SSL
AI4TS
100
10
0
24 Oct 2021
A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data
Madina Abdrakhmanova
Siwen Guo
Yerbolat Khassanov
Shohreh Haddadan
42
5
0
23 Oct 2021
Optimizing Multi-Taper Features for Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
71
1
0
21 Oct 2021
Rep Works in Speaker Verification
Yufeng Ma
Miao Zhao
Yiwei Ding
Yu Zheng
Min Liu
Minqiang Xu
82
8
0
19 Oct 2021
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information
Baolin Zheng
Peipei Jiang
Qian Wang
Qi Li
Chao Shen
Cong Wang
Yunjie Ge
Qingyang Teng
Shenyi Zhang
AAML
44
73
0
19 Oct 2021
Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information
Jenthe Thienpondt
Brecht Desplanques
Kris Demuynck
40
9
0
18 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
76
12
0
17 Oct 2021
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
162
202
0
14 Oct 2021
The Impact of Spatiotemporal Augmentations on Self-Supervised Audiovisual Representation Learning
Haider Al-Tahan
Y. Mohsenzadeh
SSL
AI4TS
49
0
0
13 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
428
1,115
0
13 Oct 2021
Duality Temporal-channel-frequency Attention Enhanced Speaker Representation Learning
Li Zhang
Qing Wang
Lei Xie
114
17
0
13 Oct 2021
Simple Attention Module based Speaker Verification with Iterative noisy label detection
Xiaoyi Qin
Na Li
Chao Weng
Jane Polak Scowcroft
Ming Li
NoLa
87
52
0
13 Oct 2021
Adapting TTS models For New Speakers using Transfer Learning
Paarth Neekhara
Jason Chun Lok Li
Boris Ginsburg
144
15
0
12 Oct 2021