Supervised Speech Separation Based on Deep Learning: An Overview


24 August 2017
DeLiang Wang
Jitong Chen
    SSL

Papers citing "Supervised Speech Separation Based on Deep Learning: An Overview"

50 / 219 papers shown
ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior
Zhongweiyang Xu
Xulin Fan
Zhong-Qiu Wang
Xilin Jiang
Romit Roy Choudhury
DiffM
59
0
0
08 May 2025
ReverbMiipher: Generative Speech Restoration meets Reverberation Characteristics Controllability
Wataru Nakata
Yuma Koizumi
Shigeki Karita
Robin Scheibler
Haruko Ishikawa
Adriana Guevara-Rukoz
Heiga Zen
M. Bacchiani
58
0
0
08 May 2025
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
58
0
0
08 May 2025
A Synergistic Framework of Nonlinear Acoustic Computing and Reinforcement Learning for Real-World Human-Robot Interaction
Xiaoliang Chen
Xin Yu
Le Chang
Yunhe Huang
Jiashuai He
...
Jin Li
Likai Lin
Ziyu Zeng
Xianling Tu
Shuyu Zhang
51
0
0
04 May 2025
CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR
Nian Shao
Rui Zhou
Pengyu Wang
Xian Li
Ying Fang
Yujie Yang
Xiaofei Li
46
0
0
27 Feb 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Gaël Richard
77
3
0
20 Jan 2025
Microphone Array Signal Processing and Deep Learning for Speech Enhancement
Reinhold Haeb-Umbach
Tomohiro Nakatani
Marc Delcroix
Christoph Boeddeker
Tsubasa Ochiai
48
0
0
13 Jan 2025
FlowSep: Language-Queried Sound Separation with Rectified Flow Matching
Yi Yuan
Xubo Liu
Haohe Liu
Mark D. Plumbley
Wenwu Wang
65
3
0
10 Jan 2025
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi
Triantafyllos Afouras
Andrew Zisserman
52
28
0
02 Jan 2025
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
44
2
0
02 Oct 2024
A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation
Jingyuan Wang
Jie Zhang
Shihao Chen
Miao Sun
26
0
0
19 Sep 2024
Learning Source Disentanglement in Neural Audio Codec
Xiaoyu Bie
Xubo Liu
Gaël Richard
34
1
0
17 Sep 2024
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Wenze Ren
Haibin Wu
Yi-Cheng Lin
Xuanjun Chen
Rong-Yu Chao
Kuo-Hsuan Hung
You-Jin Li
Wen-Yuan Ting
Hsin-Min Wang
Yu Tsao
Mamba
49
0
0
16 Sep 2024
Language-Queried Target Sound Extraction Without Parallel Training Data
Hao Ma
Zhiyuan Peng
Xu Li
Yukai Li
Mingjie Shao
Qiuqiang Kong
Xuelong Li
VLM
80
1
0
14 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
45
3
0
04 Sep 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition
Hao Shi
Yuan Gao
Zhaoheng Ni
Tatsuya Kawahara
41
2
0
01 Sep 2024
Unsupervised Blind Joint Dereverberation and Room Acoustics Estimation with Diffusion Models
Jean-Marie Lemercier
Eloi Moliner
Simon Welker
Vesa Valimaki
Timo Gerkmann
59
2
0
14 Aug 2024
Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance?
Atsuo Hiroe
Katsutoshi Itoyama
Kazuhiro Nakadai
43
0
0
22 Jul 2024
Knowledge boosting during low-latency inference
Vidya Srinivas
Malek Itani
Tuochao Chen
Sefik Emre Eskimez
Takuya Yoshioka
Shyamnath Gollakota
39
2
0
09 Jul 2024
Enhancing spatial auditory attention decoding with neuroscience-inspired prototype training
Zelin Qiu
Jianjun Gu
Dingding Yao
Junfeng Li
32
2
0
09 Jul 2024
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
Bing Yang
Changsheng Quan
Yabo Wang
Pengyu Wang
Yujie Yang
Ying Fang
Nian Shao
Hui Bu
Xin Xu
Xiaofei Li
48
5
0
28 Jun 2024
SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement
Zhongshu Hou
Tong Lei
Qinwen Hu
Zhanzhong Cao
Ming Tang
Jing Lu
50
0
0
24 Jun 2024
Pre-training Feature Guided Diffusion Model for Speech Enhancement
Yiyuan Yang
Niki Trigoni
Andrew Markham
42
3
0
11 Jun 2024
An Investigation of Incorporating Mamba for Speech Enhancement
Rong-Yu Chao
Wen-Huang Cheng
Moreno La Quatra
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Szu-Wei Fu
Yu Tsao
Mamba
55
27
0
10 May 2024
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
39
1
0
29 Apr 2024
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
Tsubasa Ochiai
Kazuma Iwamoto
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
Shoko Araki
Shigeru Katagiri
31
3
0
23 Apr 2024
TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition
Chengxin Chen
Pengyuan Zhang
45
0
0
19 Apr 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
45
21
0
15 Apr 2024
Weakly-supervised Audio Separation via Bi-modal Semantic Similarity
Tanvir Mahmud
Saeed Amizadeh
K. Koishida
Diana Marculescu
AI4TS
26
2
0
02 Apr 2024
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
Yufeng Yang
Ashutosh Pandey
DeLiang Wang
46
4
0
11 Mar 2024
Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge
Simon Leglaive
Matthieu Fraticelli
Hend ElGhazaly
Léonie Borne
Mostafa Sadeghi
Scott Wisdom
Manuel Pariente
J. Hershey
Daniel Pressnitzer
Jon P. Barker
26
8
0
02 Feb 2024
An Analysis of the Variance of Diffusion-based Speech Enhancement
Bunlong Lay
Timo Gerkmann
DiffM
27
0
0
01 Feb 2024
Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement
Ashutosh Pandey
Buye Xu
52
2
0
15 Jan 2024
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
38
7
0
07 Dec 2023
Multi-channel Conversational Speaker Separation via Neural Diarization
H. Taherian
DeLiang Wang
BDL
44
16
0
15 Nov 2023
Speech enhancement with frequency domain auto-regressive modeling
Anurenjan Purushothaman
Debottam Dutta
Rohit Kumar
Sriram Ganapathy
32
2
0
24 Sep 2023
Is the Ideal Ratio Mask Really the Best? -- Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers
Atsuo Hiroe
Katsutoshi Itoyama
Kazuhiro Nakadai
13
1
0
21 Sep 2023
Diffusion-based speech enhancement with a weighted generative-supervised learning loss
Jean-Eudes Ayilo
Mostafa Sadeghi
Romain Serizel
DiffM
40
9
0
19 Sep 2023
PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement
Jia Pan
Shulin He
Tianci Wu
Hui Zhang
Xueliang Zhang
29
0
0
19 Sep 2023
Deep learning-based denoising streamed from mobile phones improves speech-in-noise understanding for hearing aid users
P. U. Diehl
Hannes Zilly
Felix Sattler
Y. Singer
Kevin Kepp
...
Paul Meyer-Rachner
A. Pudszuhn
V. Hofmann
M. Vormann
Elias Sprengel
37
3
0
22 Aug 2023
Separate Anything You Describe
Xubo Liu
Qiuqiang Kong
Yan Zhao
Haohe Liu
Yi Yuan
Yuzhuo Liu
Rui Xia
Yuxuan Wang
Mark D. Plumbley
Wenwu Wang
VLM
38
43
0
09 Aug 2023
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
Maja Pantic
VGen
DiffM
40
1
0
31 Jul 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
36
9
0
18 Jun 2023
Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement
Liang Wan
Hongqing Liu
Yi Zhou
Jie Ji
48
2
0
15 Jun 2023
Unsupervised speech enhancement with deep dynamical generative speech and noise models
Xiaoyu Lin
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
29
3
0
13 Jun 2023
A Mask Free Neural Network for Monaural Speech Enhancement
Liangqi Liu
Haixing Guan
Jinlong Ma
Wei Dai
Guang-Yi Wang
Shaowei Ding
34
11
0
07 Jun 2023
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
39
11
0
26 May 2023
Martian time-series unraveled: A multi-scale nested approach with factorial variational autoencoders
Ali Siahkoohi
Rudy Morel
Randall Balestriero
Erwan Allys
G. Sainton
Taichi Kawamura
Maarten V. de Hoop
41
2
0
25 May 2023
Integrating Uncertainty into Neural Network-based Speech Enhancement
Hu Fang
Dennis Becker
S. Wermter
Timo Gerkmann
UQCV
37
2
0
15 May 2023
Inter-SubNet: Speech Enhancement with Subband Interaction
Jun Chen
Wei Rao
Zehao Wang
Jiuxin Lin
Zhiyong Wu
Yannan Wang
Shidong Shang
Helen M. Meng
21
13
0
09 May 2023