ResearchTrend.AI
Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System

14 April 2018
Weicheng Cai
Jinkun Chen
Ming Li

Papers citing "Exploring the Encoding Layer and Loss Function in End-to-End Speaker and Language Recognition System"

50 / 146 papers shown
Temporal Attention Pooling for Frequency Dynamic Convolution in Sound Event Detection
Hyeonuk Nam
Yong-Hwa Park
33
0
0
17 Apr 2025
JiTTER: Jigsaw Temporal Transformer for Event Reconstruction for Self-Supervised Sound Event Detection
Hyeonuk Nam
Yong-Hwa Park
48
1
0
28 Feb 2025
Enhancing Risk Assessment in Transformers with Loss-at-Risk Functions
Jinghan Zhang
Henry Xie
Xinhao Zhang
Kunpeng Liu
51
1
0
04 Nov 2024
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
Zhenyu Wang
John H. L. Hansen
AAML
40
1
0
23 Aug 2024
Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
Xiaoxiao Miao
Yuxiang Zhang
Xin Wang
N. Tomashenko
D. Soh
Ian Mcloughlin
42
2
0
12 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
44
4
0
21 Jul 2024
Towards Lightweight Speaker Verification via Adaptive Neural Network Quantization
Bei Liu
Haoyu Wang
Yanmin Qian
MQ
40
1
0
08 Jun 2024
Speaker Characterization by means of Attention Pooling
Federico Costa
Miquel India
Javier Hernando
33
1
0
07 May 2024
Who is Authentic Speaker
Qiang Huang
23
0
0
30 Apr 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach
Wenbin Wang
Yang Song
Sanjay Jha
42
11
0
28 Apr 2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Kenichi Fujita
Atsushi Ando
Yusuke Ijima
18
2
0
11 Feb 2024
Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
Danwei Cai
Zexin Cai
Ming Li
35
0
0
03 Jan 2024
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
Hyunjun Heo
U.H Shin
Ran Lee
YoungJu Cheon
Hyung-Min Park
31
9
0
14 Dec 2023
Multi-objective Progressive Clustering for Semi-supervised Domain Adaptation in Speaker Verification
Ze Li
Yuke Lin
Ning Jiang
Xiaoyi Qin
Guoqing Zhao
Haiying Wu
Ming Li
VLM
44
1
0
07 Oct 2023
Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification
Yuke Lin
Xiaoyi Qin
Ning Jiang
Guoqing Zhao
Ming Li
42
3
0
25 Sep 2023
RADIO: Reference-Agnostic Dubbing Video Synthesis
Dongyeun Lee
Chaewon Kim
Sangjoon Yu
Jaejun Yoo
Gyeong-Moon Park
VGen
DiffM
42
1
0
05 Sep 2023
VoxBlink: A Large Scale Speaker Verification Dataset on Camera
Yuke Lin
Xiaoyi Qin
Guoqing Zhao
Ming Cheng
Ning Jiang
Haiying Wu
Ming Li
49
14
0
14 Aug 2023
Towards spoken dialect identification of Irish
Liam Lonergan
Mengjie Qian
Neasa Ní Chiaráin
Christer Gobl
A. N. Chasaide
24
4
0
14 Jul 2023
VIFS: An End-to-End Variational Inference for Foley Sound Synthesis
Junhyeok Lee
Hyeonuk Nam
Yong-Hwa Park
18
4
0
08 Jun 2023
On the Robustness of Arabic Speech Dialect Identification
Peter Sullivan
AbdelRahim Elmadany
Muhammad Abdul-Mageed
25
8
0
01 Jun 2023
Visualizing data augmentation in deep speaker recognition
Pengqi Li
Lantian Li
A. Hamdulla
D. Wang
28
3
0
25 May 2023
Ordered and Binary Speaker Embedding
Jiaying Wang
Xianglong Wang
Na Wang
Lantian Li
Dong Wang
23
0
0
25 May 2023
Progressive Sub-Graph Clustering Algorithm for Semi-Supervised Domain Adaptation Speaker Verification
Zhuo Li
Jingze Lu
Z. Zhao
Wenchao Wang
Pengyuan Zhang
32
1
0
22 May 2023
Unsupervised Speech Representation Pooling Using Vector Quantization
J. Park
Kwanghee Choi
Hyunjun Heo
Hyung-Min Park
SSL
33
0
0
08 Apr 2023
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
Hyun Joon Park
Seok Woo Yang
Jin Sob Kim
Wooseok Shin
S. W. Han
33
18
0
16 Mar 2023
A Study on Bias and Fairness In Deep Speaker Recognition
Amirhossein Hajavi
Ali Etemad
27
2
0
14 Mar 2023
Improving Transformer-based Networks With Locality For Automatic Speaker Verification
Mufan Sang
Yong Zhao
Gang Liu
John H. L. Hansen
Jian Wu
ViT
33
14
0
17 Feb 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
654
0
05 Jan 2023
Source Tracing: Detecting Voice Spoofing
Tinglong Zhu
Xingming Wang
Xiaoyi Qin
Ming Li
32
12
0
16 Dec 2022
Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition
Zhenyu Wang
John H. L. Hansen
36
21
0
17 Nov 2022
Neural Inference of Gaussian Processes for Time Series Data of Quasars
E. Danilov
A. Ćiprijanović
Brian D. Nord
BDL
AI4TS
14
4
0
17 Nov 2022
Disentangled representation learning for multilingual speaker recognition
Kihyun Nam
You-kyong. Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
53
6
0
01 Nov 2022
Speaker Representation Learning via Contrastive Loss with Maximal Speaker Separability
Zhe Li
Man-Wai Mak
SSL
26
6
0
29 Oct 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
33
3
0
27 Oct 2022
Large-scale learning of generalised representations for speaker recognition
Jee-weon Jung
Hee-Soo Heo
Bong-Jin Lee
Jaesong Lee
Hye-jin Shim
Youngki Kwon
Joon Son Chung
Shinji Watanabe
CVBM
36
6
0
20 Oct 2022
Deepfake audio detection by speaker verification
Alessandro Pianese
D. Cozzolino
Giovanni Poggi
L. Verdoliva
38
39
0
28 Sep 2022
Attention and DCT based Global Context Modeling for Text-independent Speaker Recognition
Wei Xia
John H. L. Hansen
32
4
0
04 Aug 2022
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
Xiaoyi Qin
Na Li
Chao Weng
Dan Su
Ming Li
61
16
0
13 Jul 2022
Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning
Théo Lepage
Réda Dehak
SSL
29
12
0
12 Jul 2022
Multi-Frequency Information Enhanced Channel Attention Module for Speaker Representation Learning
Mufan Sang
John H. L. Hansen
22
13
0
10 Jul 2022
Transport-Oriented Feature Aggregation for Speaker Embedding Learning
Yusheng Tian
Jingyu Li
Tan Lee
13
1
0
26 Jun 2022
The SJTU X-LANCE Lab System for CNSRC 2022
Zhengyang Chen
Bei Liu
Bing Han
Leying Zhang
Y. Qian
20
19
0
23 Jun 2022
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems
Danwei Cai
Zexin Cai
Ming Li
23
10
0
18 Jun 2022
Bi-LSTM Scoring Based Similarity Measurement with Agglomerative Hierarchical Clustering (AHC) for Speaker Diarization
S. S. Nijhawan
Homayoon Beigi
11
0
0
19 May 2022
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge
Tanel Alumäe
Kunnar Kukk
15
5
0
14 May 2022
Back-ends Selection for Deep Speaker Embeddings
Zhuo Li
Runqiu Xiao
Zi-qiang Zhang
Zhenduo Zhao
Wenchao Wang
Pengyuan Zhang
19
0
0
25 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
28
16
0
04 Apr 2022
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly
Yu-Huai Peng
Hung-Shin Lee
Pin-Tuan Huang
Hsin-Min Wang
21
0
0
30 Mar 2022
Combination of Time-domain, Frequency-domain, and Cepstral-domain Acoustic Features for Speech Commands Classification
Yikang Wang
Hiromitsu Nishizaki
36
1
0
30 Mar 2022
Towards a Common Speech Analysis Engine
Hagai Aronowitz
Itai Gat
E. Morais
Weizhong Zhu
R. Hoory
26
3
0
01 Mar 2022