1804.05160
Cited By
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System
14 April 2018
Weicheng Cai
Jinkun Chen
Ming Li
Papers citing
"Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System"
50 / 146 papers shown
Title
Temporal Attention Pooling for Frequency Dynamic Convolution in Sound Event Detection
Hyeonuk Nam
Yong-Hwa Park
33
0
0
17 Apr 2025
JiTTER: Jigsaw Temporal Transformer for Event Reconstruction for Self-Supervised Sound Event Detection
Hyeonuk Nam
Yong-Hwa Park
48
1
0
28 Feb 2025
Enhancing Risk Assessment in Transformers with Loss-at-Risk Functions
Jinghan Zhang
Henry Xie
Xinhao Zhang
Kunpeng Liu
51
1
0
04 Nov 2024
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
Zhenyu Wang
John H. L. Hansen
AAML
40
1
0
23 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
42
2
0
12 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
44
4
0
21 Jul 2024
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization
Bei Liu
Haoyu Wang
Yanmin Qian
MQ
40
1
0
08 Jun 2024
Speaker Characterization by means of Attention Pooling
Federico Costa
Miquel India
Javier Hernando
33
1
0
07 May 2024
Who is Authentic Speaker
Qiang Huang
23
0
0
30 Apr 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach
Wenbin Wang
Yang Song
Sanjay Jha
42
11
0
28 Apr 2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Kenichi Fujita
Atsushi Ando
Yusuke Ijima
18
2
0
11 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Danwei Cai
Zexin Cai
Ming Li
35
0
0
03 Jan 2024
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
31
9
0
14 Dec 2023
Multi-objective Progressive Clustering for Semi-supervised Domain Adaptation in Speaker Verification
Ze Li
Yuke Lin
Ning Jiang
Xiaoyi Qin
Guoqing Zhao
Haiying Wu
Ming Li
VLM
44
1
0
07 Oct 2023
Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification
Yuke Lin
Xiaoyi Qin
Ning Jiang
Guoqing Zhao
Ming Li
42
3
0
25 Sep 2023
RADIO: Reference-Agnostic Dubbing Video Synthesis
Dongyeun Lee
Chaewon Kim
Sangjoon Yu
Jaejun Yoo
Gyeong-Moon Park
VGen
DiffM
42
1
0
05 Sep 2023
VoxBlink: A Large Scale Speaker Verification Dataset on Camera
Yuke Lin
Xiaoyi Qin
Guoqing Zhao
Ming Cheng
Ning Jiang
Haiying Wu
Ming Li
49
14
0
14 Aug 2023
Towards spoken dialect identification of Irish
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
24
4
0
14 Jul 2023
VIFS: An End-to-End Variational Inference for Foley Sound Synthesis
Junhyeok Lee
Hyeonuk Nam
Yong-Hwa Park
18
4
0
08 Jun 2023
On the Robustness of Arabic Speech Dialect Identification
Peter Sullivan
AbdelRahim Elmadany
Muhammad Abdul-Mageed
25
8
0
01 Jun 2023
Visualizing data augmentation in deep speaker recognition
Pengqi Li
Lantian Li
A. Hamdulla
D. Wang
28
3
0
25 May 2023
Ordered and Binary Speaker Embedding
Jiaying Wang
Xianglong Wang
Na Wang
Lantian Li
Dong Wang
23
0
0
25 May 2023
Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification
Zhuo Li
Jingze Lu
Z. Zhao
Wenchao Wang
Pengyuan Zhang
32
1
0
22 May 2023
Unsupervised Speech Representation Pooling Using Vector Quantization
J. Park
Kwanghee Choi
Hyunjun Heo
Hyung-Min Park
SSL
33
0
0
08 Apr 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
33
18
0
16 Mar 2023
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
27
2
0
14 Mar 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
33
14
0
17 Feb 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
654
0
05 Jan 2023
Source Tracing: Detecting Voice Spoofing
Tinglong Zhu
Xingming Wang
Xiaoyi Qin
Ming Li
32
12
0
16 Dec 2022
Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition
Zhenyu Wang
John H. L. Hansen
36
21
0
17 Nov 2022
Neural Inference of Gaussian Processes for Time Series Data of Quasars
E. Danilov
A. Ćiprijanović
Brian D. Nord
BDL
AI4TS
14
4
0
17 Nov 2022
Disentangled representation learning for multilingual speaker recognition
Kihyun Nam
You-kyong Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
53
6
0
01 Nov 2022
Speaker Representation Learning via Contrastive Loss with Maximal Speaker Separability
Zhe Li
Man-Wai Mak
SSL
26
6
0
29 Oct 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
33
3
0
27 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
36
6
0
20 Oct 2022
Deepfake audio detection by speaker verification
Alessandro Pianese
D. Cozzolino
Giovanni Poggi
L. Verdoliva
38
39
0
28 Sep 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition
Wei Xia
John H. L. Hansen
32
4
0
04 Aug 2022
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
Xiaoyi Qin
Na Li
Chao Weng
Dan Su
Ming Li
61
16
0
13 Jul 2022
Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning
Théo Lepage
Réda Dehak
SSL
29
12
0
12 Jul 2022
Multi-Frequency Information Enhanced Channel Attention Module for Speaker Representation Learning
Mufan Sang
John H. L. Hansen
22
13
0
10 Jul 2022
Transport-Oriented Feature Aggregation for Speaker Embedding Learning
Yusheng Tian
Jingyu Li
Tan Lee
13
1
0
26 Jun 2022
The SJTU X-LANCE Lab System for CNSRC 2022
Zhengyang Chen
Bei Liu
Bing Han
Leying Zhang
Y. Qian
20
19
0
23 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
23
10
0
18 Jun 2022
Bi-LSTM Scoring Based Similarity Measurement with Agglomerative Hierarchical Clustering (AHC) for Speaker Diarization
S. S. Nijhawan
Homayoon Beigi
11
0
0
19 May 2022
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge
Tanel Alumäe
Kunnar Kukk
15
5
0
14 May 2022
Back-ends Selection for Deep Speaker Embeddings
Zhuo Li
Runqiu Xiao
Zi-qiang Zhang
Zhenduo Zhao
Wenchao Wang
Pengyuan Zhang
19
0
0
25 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
28
16
0
04 Apr 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
21
0
0
30 Mar 2022
Combination of Time-domain, Frequency-domain, and Cepstral-domain Acoustic Features for Speech Commands Classification
Yikang Wang
Hiromitsu Nishizaki
36
1
0
30 Mar 2022
Towards a Common Speech Analysis Engine
Hagai Aronowitz
Itai Gat
E. Morais
Weizhong Zhu
R. Hoory
26
3
0
01 Mar 2022