Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks

18 March 2017
Morten Kolbaek
Dong Yu
Z. Tan
Jesper Jensen

Papers citing "Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks"

50 / 96 papers shown
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Z. Wang
48
0
0
08 May 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Gaël Richard
69
2
0
20 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
43
0
0
13 Jan 2025
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
41
1
0
19 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
30
2
0
01 Sep 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
34
0
0
04 Mar 2024
End-to-end Online Speaker Diarization with Target Speaker Tracking
Weiqing Wang
Ming Li
28
5
0
12 Oct 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
28
7
0
09 Oct 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
27
14
0
09 Aug 2023
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schlüter
Reinhold Haeb-Umbach
16
6
0
21 Jun 2023
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Junyu Wang
22
1
0
09 Jun 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
A. Brutti
S. Squartini
37
9
0
29 May 2023
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
26
11
0
26 May 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
Kristina Tesch
Timo Gerkmann
24
16
0
24 Apr 2023
Beamformer-Guided Target Speaker Extraction
Mohamed Elminshawi
Srikanth Raj Chetupalli
Emanuel Habets
19
7
0
15 Mar 2023
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
24
0
0
14 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
28
21
0
01 Dec 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
Zhongqiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
24
119
0
22 Nov 2022
Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis
Zhihao Du
Shiliang Zhang
Siqi Zheng
Zhijie Yan
21
14
0
18 Nov 2022
Diffusion-based Generative Speech Source Separation
Robin Scheibler
Youna Ji
Soo-Whan Chung
J. Byun
Soyeon Choe
Min-Seok Choi
DiffM
21
38
0
31 Oct 2022
DiaCorrect: End-to-end error correction for speaker diarization
Jiangyu Han
Yuhang Cao
Heng Lu
Yanhua Long
37
0
0
31 Oct 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation
William Ravenscroft
Stefan Goetze
Thomas Hain
25
11
0
27 Oct 2022
Audio Signal Enhancement with Learning from Positive and Unlabelled Data
N. Ito
Masashi Sugiyama
19
7
0
27 Oct 2022
Position tracking of a varying number of sound sources with sliding permutation invariant training
David Diaz-Guerra
A. Politis
Tuomas Virtanen
17
5
0
26 Oct 2022
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT
K. Kinoshita
Thilo von Neumann
Marc Delcroix
Christoph Boeddeker
Reinhold Haeb-Umbach
38
4
0
28 Jul 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation
Jiangyu Han
Yanhua Long
20
6
0
23 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
22
31
0
08 Apr 2022
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao
Dongchao Yang
Rongzhi Gu
Haoran Zhang
Yuexian Zou
23
15
0
04 Apr 2022
End-to-End Multi-speaker ASR with Independent Vector Analysis
Robin Scheibler
Wangyou Zhang
Xuankai Chang
Shinji Watanabe
Y. Qian
8
2
0
01 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
18
17
0
31 Mar 2022
Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan
Xinyuan Qian
Haizhou Li
SLR
21
26
0
31 Mar 2022
Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement
Ayako Yamamoto
Toshio Irino
S. Araki
Kenichi Arai
A. Ogawa
K. Kinoshita
Tomohiro Nakatani
9
2
0
31 Mar 2022
Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg
Sharon Gannot
Shlomo E. Chazan
22
3
0
06 Mar 2022
Audio-visual speech separation based on joint feature representation with cross-modal attention
Jun Xiong
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
Yanni Zhang
20
3
0
05 Mar 2022
Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model
K. Kinoshita
Marc Delcroix
Tomoharu Iwata
BDL
15
19
0
14 Feb 2022
MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
Ertuğ Karamatlı
S. Kırbız
SSL
30
9
0
08 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation
Chenda Li
Lei Yang
Weiqin Wang
Y. Qian
24
24
0
26 Jan 2022
SA-SDR: A novel loss function for separation of meeting style data
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
24
20
0
29 Oct 2021
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training
Kazuki Shimada
Yuichiro Koyama
Shusuke Takahashi
Naoya Takahashi
E. Tsunoo
Yuki Mitsufuji
13
63
0
14 Oct 2021
SDR -- Medium Rare with Fast Computations
Robin Scheibler
18
17
0
13 Oct 2021
Multi-channel Narrow-band Deep Speech Separation with Full-band Permutation Invariant Training
Changsheng Quan
Xiaofei Li
30
14
0
12 Oct 2021
VarArray: Array-Geometry-Agnostic Continuous Speech Separation
Takuya Yoshioka
Xiaofei Wang
Dongmei Wang
M. Tang
Zirun Zhu
Zhuo Chen
Naoyuki Kanda
15
37
0
12 Oct 2021
Location-based training for multi-channel talker-independent speaker separation
H. Taherian
Ke Tan
DeLiang Wang
11
10
0
08 Oct 2021
A Novel Blind Source Separation Framework Towards Maximum Signal-To-Interference Ratio
Jianjun Gu
Longbiao Cheng
Dingding Yao
Junfeng Li
Yonghong Yan
11
0
0
07 Oct 2021
USEV: Universal Speaker Extraction with Visual Cue
Zexu Pan
Meng Ge
Haizhou Li
31
41
0
30 Sep 2021
Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers
Thilo von Neumann
K. Kinoshita
Christoph Boeddeker
Marc Delcroix
Reinhold Haeb-Umbach
8
23
0
30 Jul 2021
Speeding Up Permutation Invariant Training for Source Separation
Thilo von Neumann
Christoph Boeddeker
K. Kinoshita
Marc Delcroix
Reinhold Haeb-Umbach
14
6
0
30 Jul 2021
Multi-Task Audio Source Separation
Lu Zhang
Chenxing Li
Feng Deng
Xiaorui Wang
38
8
0
14 Jul 2021
Lightweight Dual-channel Target Speaker Separation for Mobile Voice Communication
Yuanyuan Bao
Yanze Xu
Na Xu
Wenjing Yang
Hongfeng Li
Shicong Li
Y. Jia
Fei Xiang
Jincheng He
Ming Li
22
1
0
05 Jun 2021