Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1607.00325
Cited By
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation
1 July 2016
Dong Yu
Morten Kolbæk
Zheng-Hua Tan
Jesper Jensen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation"
50 / 155 papers shown
Title
ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior
Zhongweiyang Xu
Xulin Fan
Zhong-Qiu Wang
Xilin Jiang
Romit Roy Choudhury
DiffM
54
0
0
08 May 2025
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
Kohei Saijo
Tetsuji Ogawa
52
1
0
28 Apr 2025
Location-Oriented Sound Event Localization and Detection with Spatial Mapping and Regression Localization
Xueping Zhang
Yaxiong Chen
Ruilin Yao
Yunfei Zi
Shengwu Xiong
38
0
0
11 Apr 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
43
6
0
17 Jan 2025
Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask Estimation
Jakob Kienegger
Alina Mannanova
Timo Gerkmann
46
0
0
10 Jan 2025
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi
Triantafyllos Afouras
Andrew Zisserman
42
28
0
02 Jan 2025
SoundLoc3D: Invisible 3D Sound Source Localization and Classification Using a Multimodal RGB-D Acoustic Camera
Yuhang He
Sangyun Shin
Anoop Cherian
Niki Trigoni
Andrew Markham
88
0
0
31 Dec 2024
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
Jiawen Kang
Lingwei Meng
Mingyu Cui
Yuejiao Wang
Xixin Wu
Xunying Liu
Helen Meng
41
2
0
19 Sep 2024
Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions
Lingwei Meng
Shujie Hu
Jiawen Kang
Zhaoqing Li
Yuejiao Wang
Wenxuan Wu
Xixin Wu
Xunying Liu
Helen Meng
AuLLM
72
2
0
13 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
39
2
0
04 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
34
2
0
01 Sep 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Weakly-supervised Audio Separation via Bi-modal Semantic Similarity
Tanvir Mahmud
Saeed Amizadeh
K. Koishida
Diana Marculescu
AI4TS
16
2
0
02 Apr 2024
Speech-Aware Neural Diarization with Encoder-Decoder Attractor Guided by Attention Constraints
PeiYing Lee
HauYun Guo
Berlin Chen
34
0
0
21 Mar 2024
Single-Channel Robot Ego-Speech Filtering during Human-Robot Interaction
Yue Li
Koen V. Hindriks
Florian A. Kunneman
35
2
0
05 Mar 2024
Sound Source Separation Using Latent Variational Block-Wise Disentanglement
Karim Helwani
M. Togami
Paris Smaragdis
Michael M. Goodwin
BDL
DRL
26
1
0
08 Feb 2024
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
Samuel Pegg
Kai Li
Xiaolin Hu
32
1
0
25 Jan 2024
Self-Supervised Music Source Separation Using Vector-Quantized Source Category Estimates
Marco Pasini
Stefan Lattner
George Fazekas
35
1
0
21 Nov 2023
Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors
Di Liang
Nian Shao
Xiaofei Li
33
4
0
25 Sep 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer
Zhengyang Chen
Bing Han
Shuai Wang
Yan-min Qian
28
18
0
13 Sep 2023
The Sound Demixing Challenge 2023
\unicode
x
2013
\unicode{x2013}
\unicode
x
2013
Cinematic Demixing Track
Stefan Uhlich
Giorgio Fabbro
M. Hirano
Shusuke Takahashi
Gordon Wichern
...
R. Solovyev
A. Stempkovskiy
T. Habruseva
M. Sukhovei
Yuki Mitsufuji
50
11
0
14 Aug 2023
Complete and separate: Conditional separation with missing target source attribute completion
Dimitrios Bralios
Efthymios Tzinis
Paris Smaragdis
35
0
0
27 Jul 2023
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Jiuxin Lin
X. Cai
Heinrich Dinkel
Jun Chen
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Zhiyong Wu
Yujun Wang
Helen M. Meng
22
21
0
25 Jun 2023
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger
Peter Vieting
Christoph Boeddeker
Ralf Schluter
Reinhold Häb-Umbach
24
6
0
21 Jun 2023
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Junyu Wang
22
1
0
09 Jun 2023
Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator
Lingwei Meng
Jiawen Kang
Mingyu Cui
Haibin Wu
Xixin Wu
Helen M. Meng
39
10
0
25 May 2023
BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR
Yuhao Liang
Fan Yu
Yangze Li
Pengcheng Guo
Shiliang Zhang
Qian Chen
Linfu Xie
30
8
0
23 May 2023
Unsupervised Multi-channel Separation and Adaptation
Cong Han
K. Wilson
Scott Wisdom
J. Hershey
26
4
0
18 May 2023
Neural Diarization with Non-autoregressive Intermediate Attractors
Yusuke Fujita
Tatsuya Komatsu
Robin Scheibler
Yusuke Kida
Tetsuji Ogawa
40
11
0
13 Mar 2023
Neural Target Speech Extraction: An Overview
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
JanHonza'' vCernocký
Dong Yu
23
86
0
31 Jan 2023
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
30
0
0
14 Dec 2022
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
Hao-Wen Dong
Naoya Takahashi
Yuki Mitsufuji
Julian McAuley
Taylor Berg-Kirkpatrick
VLM
CLIP
31
25
0
14 Dec 2022
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
34
27
0
07 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
30
21
0
01 Dec 2022
JaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus
Tomohiko Nakamura
Shinnosuke Takamichi
Naoko Tanji
Satoru Fukayama
Hiroshi Saruwatari
23
4
0
29 Nov 2022
Mix and Localize: Localizing Sound Sources in Mixtures
Xixi Hu
Ziyang Chen
Andrew Owens
30
51
0
28 Nov 2022
Latent Iterative Refinement for Modular Source Separation
Dimitrios Bralios
Efthymios Tzinis
Gordon Wichern
Paris Smaragdis
Jonathan Le Roux
BDL
33
5
0
22 Nov 2022
Self-Remixing: Unsupervised Speech Separation via Separation and Remixing
Kohei Saijo
Tetsuji Ogawa
SSL
22
11
0
18 Nov 2022
Show Me the Instruments: Musical Instrument Retrieval from Mixture Audio
Kyungsuk Kim
Minju Park
Ha-na Joung
Yunkee Chae
Yeongbeom Hong
Seonghyeon Go
Kyogu Lee
11
6
0
15 Nov 2022
Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition
Zili Huang
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yiming Wang
Jinyu Li
Takuya Yoshioka
Xiaofei Wang
Peidong Wang
28
3
0
10 Nov 2022
Speech separation with large-scale self-supervised learning
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu-Huan Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
S. Sivasankaran
Sefik Emre Eskimez
19
14
0
09 Nov 2022
BER: Balanced Error Rate For Speaker Diarization
Tao Liu
K. Yu
20
4
0
08 Nov 2022
TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge
Bowen Pang
Huan Zhao
Gaosheng Zhang
Xiaoyue Yang
Yanguo Sun
Li Zhang
Qing Wang
Linfu Xie
BDL
28
2
0
26 Oct 2022
Position tracking of a varying number of sound sources with sliding permutation invariant training
David Diaz-Guerra
A. Politis
Tuomas Virtanen
30
5
0
26 Oct 2022
Adversarial Permutation Invariant Training for Universal Sound Separation
Emilian Postolache
Jordi Pons
Santiago Pascual
Joan Serrà
VLM
28
6
0
21 Oct 2022
Deep Learning Based Stage-wise Two-dimensional Speaker Localization with Large Ad-hoc Microphone Arrays
Shupei Liu
Linfeng Feng
Yijun Gong
Chengdong Liang
Chen Zhang
Xiao-Lei Zhang
Xuelong Li
18
3
0
19 Oct 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
24
7
0
18 Jun 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation
Jiangyu Han
Yanhua Long
28
6
0
23 Apr 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System
M. Z. Ozturk
Chenshu Wu
Beibei Wang
Min Wu
K. Liu
27
20
0
14 Apr 2022
Listen only to me! How well can target speech extraction handle false alarms?
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Kateřina Žmolíková
Hiroshi Sato
Tomohiro Nakatani
34
15
0
11 Apr 2022
1
2
3
4
Next