1811.02508
Cited By
SDR - half-baked or well done?
6 November 2018
Jonathan Le Roux
Scott Wisdom
Hakan Erdogan
J. Hershey
Papers citing
"SDR - half-baked or well done?"
50 / 614 papers shown
Title
Source Separation by Flow Matching
Robin Scheibler
John R. Hershey
Arnaud Doucet
Henry Li
5
0
0
22 May 2025
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers
Yuzhu Wang
A. Politis
K. Drossos
Tuomas Virtanen
5
0
0
22 May 2025
A Novel Deep Learning Framework for Efficient Multichannel Acoustic Feedback Control
Yuan-Kuei Wu
Juan Azcarreta
Kashyap Patel
Buye Xu
Jung-Suk Lee
Sanha Lee
Ashutosh Pandey
5
0
0
21 May 2025
Single-Channel Target Speech Extraction Utilizing Distance and Room Clues
Runwu Shi
Zirui Lin
Benjamin Yen
Jiang Wang
Ragib Amin Nihal
Kazuhiro Nakadai
3DV
32
0
0
20 May 2025
Time-Frequency-Based Attention Cache Memory Model for Real-Time Speech Separation
Guo Chen
Kai Li
Runxuan Yang
Xiaolin Hu
AI4TS
14
0
0
19 May 2025
Unified Architecture and Unsupervised Speech Disentanglement for Speaker Embedding-Free Enrollment in Personalized Speech Enhancement
Ziling Huang
Haixin Guan
Yanhua Long
19
0
0
18 May 2025
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Zehao Wang
55
0
0
08 May 2025
ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior
Zhongweiyang Xu
Xulin Fan
Zhong-Qiu Wang
Xilin Jiang
Romit Roy Choudhury
DiffM
59
0
0
08 May 2025
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
Bernardo Torres
Geoffroy Peeters
G. Richard
46
0
0
06 May 2025
SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation
Zhaoxi Mu
Xinyu Yang
Gang Wang
AuLLM
KELM
VLM
60
0
0
06 May 2025
Knowledge Distillation for Speech Denoising by Latent Representation Alignment with Cosine Distance
Diep Luong
Mikko Heikkinen
K. Drossos
Tuomas Virtanen
56
0
0
06 May 2025
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
Kohei Saijo
Tetsuji Ogawa
52
1
0
28 Apr 2025
Unleashing the Power of Natural Audio Featuring Multiple Sound Sources
Xize Cheng
Slytherin Wang
Zehan Wang
Rongjie Huang
Tao Jin
Zhou Zhao
47
0
0
24 Apr 2025
Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes
Binh Thien Nguyen
Masahiro Yasuda
Daiki Takeuchi
Daisuke Niizumi
Yasunori Ohishi
Noboru Harada
44
0
0
28 Mar 2025
Wireless Hearables With Programmable Speech AI Accelerators
Malek Itani
Tuochao Chen
Arun Raghavan
Gavriel Kohlberg
Shyamnath Gollakota
AuLLM
59
0
0
24 Mar 2025
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
Yufeng Yang
H. Taherian
Vahid Ahmadi Kalkhorani
DeLiang Wang
46
0
0
23 Mar 2025
Bayesian Cox model with graph-structured variable selection priors for multi-omics biomarker identification
Tobias Østmo Hermansen
M. Zucknick
Zhi Zhao
58
0
0
17 Mar 2025
A Comparative Study of Invariance-Aware Loss Functions for Deep Learning-based Gridless Direction-of-Arrival Estimation
Kuan-Lin Chen
Bhaskar D. Rao
67
1
0
16 Mar 2025
MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment
Hao Zhou
Xiaobao Guo
Yuzhe Zhu
A. Kong
DiffM
65
1
0
13 Mar 2025
Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
Minsu Kim
Rodrigo Mira
Honglie Chen
Stavros Petridis
Maja Pantic
71
0
0
13 Mar 2025
UniArray: Unified Spectral-Spatial Modeling for Array-Geometry-Agnostic Speech Separation
Weiguang Chen
Junjie Zhang
Jielong Yang
Eng Siong Chng
Xionghu Zhong
68
0
0
07 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
63
0
0
03 Mar 2025
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding
Xilin Jiang
Sukru Samet Dindar
Vishal B. Choudhari
Stephan Bickel
A. Mehta
Guy M McKhann
A. Flinker
D. Friedman
N. Mesgarani
39
2
0
24 Feb 2025
Improving Speech Enhancement by Cross- and Sub-band Processing with State Space Model
Jizhen Li
Weiping Tu
Yuhong Yang
Xinmeng Xu
Yiqun Zhang
Yanzhen Ren
Mamba
40
0
0
22 Feb 2025
RestoreGrad: Signal Restoration Using Conditional Denoising Diffusion Models with Jointly Learned Prior
Ching Hua Lee
Chouchang Yang
Jaejin Cho
Yashas Malur Saidutta
R. S. Srinivasa
Yilin Shen
Hongxia Jin
DiffM
88
0
0
19 Feb 2025
ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling
Yi-Chiao Wu
Dejan Marković
Steven Krenn
I. D. Gebru
Alexander Richard
66
1
0
04 Feb 2025
EDSep: An Effective Diffusion-Based Method for Speech Source Separation
Jinwei Dong
Xinsheng Wang
Qirong Mao
68
0
0
28 Jan 2025
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
Shengshi Yao
Jincheng Dai
Xiaoqi Qin
Sixian Wang
Siye Wang
K. Niu
Ping Zhang
40
0
0
22 Jan 2025
Speech Enhancement with Overlapped-Frame Information Fusion and Causal Self-Attention
Yuewei Zhang
Huanbin Zou
Jie Zhu
46
0
0
21 Jan 2025
30+ Years of Source Separation Research: Achievements and Future Challenges
S. Araki
N. Ito
Reinhold Haeb-Umbach
Gordon Wichern
Zhong-Qiu Wang
Yuki Mitsufuji
AI4TS
44
0
0
21 Jan 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Gaël Richard
77
3
0
20 Jan 2025
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
45
6
0
17 Jan 2025
Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music
Venkatakrishnan Vaidyanathapuram Krishnan
Noel Alben
Anish Nair
Nathaniel Condit-Schultz
46
0
0
12 Jan 2025
AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder
Samir Sadok
Simon Leglaive
Laurent Girin
Gaël Richard
Xavier Alameda-Pineda
58
1
0
10 Jan 2025
Apollo: Band-sequence Modeling for High-Quality Audio Restoration
Kai Li
Yi Luo
36
0
0
08 Jan 2025
Single-Channel Distance-Based Source Separation for Mobile GPU in Outdoor and Indoor Environments
Hanbin Bae
Byungjun Kang
Jiwon Kim
Jaeyong Hwang
Hosang Sung
Hoon-Young Cho
3DV
33
0
0
06 Jan 2025
Distance Based Single-Channel Target Speech Extraction
Runwu Shi
Benjamin Yen
Kazuhiro Nakadai
35
1
0
31 Dec 2024
Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models
Tornike Karchkhadze
M. Izadi
Shlomo Dubnov
DiffM
47
2
0
31 Dec 2024
Improving Source Extraction with Diffusion and Consistency Models
Tornike Karchkhadze
M. Izadi
Shuo Zhang
DiffM
90
1
0
09 Dec 2024
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker
Anton Smirnov
Jordi Pons
CJ Carr
Zack Zukowski
Zach Evans
Xubo Liu
79
12
0
29 Nov 2024
Multiple Choice Learning for Efficient Speech Separation with Many Speakers
David Perera
François Derrida
Théo Mariotte
Gaël Richard
S. Essid
69
0
0
27 Nov 2024
GhostRNN: Reducing State Redundancy in RNN with Cheap Operations
Hang Zhou
Xiaoxu Zheng
Yunhe Wang
Michael Bi Mi
Deyi Xiong
Kai Han
69
0
0
20 Nov 2024
SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features
Yu-Fei Shi
Yang Ai
Ye-Xin Lu
Hui-Peng Du
Zhen-Hua Ling
41
0
0
18 Nov 2024
Joint Beamforming and Speaker-Attributed ASR for Real Distant-Microphone Meeting Transcription
Can Cui
Imran A. Sheikh
Mostafa Sadeghi
Emmanuel Vincent
39
0
0
29 Oct 2024
SepMamba: State-space models for speaker separation using Mamba
Thor Højhus Avenstrup
Boldizsár Elek
István László Mádi
András Bence Schin
Morten Mørup
Bjørn Sand Jensen
Kenny Falkær Olsen
Mamba
35
0
0
28 Oct 2024
Non-intrusive Speech Quality Assessment with Diffusion Models Trained on Clean Speech
Danilo de Oliveira
Julius Richter
Jean-Marie Lemercier
Simon Welker
Timo Gerkmann
DiffM
18
2
0
23 Oct 2024
Align-ULCNet: Towards Low-Complexity and Robust Acoustic Echo and Noise Reduction
Shrishti Saha Shetu
Naveen Kumar Desiraju
Wolfgang Mack
Emanuël A. P. Habets
32
0
0
17 Oct 2024
Enhancing Crowdsourced Audio for Text-to-Speech Models
José Giraldo
Martí Llopart-Font
Alex Peiró-Lilja
Carme Armentano-Oller
Gerard Sant
Baybars Külebi
DiffM
31
0
0
17 Oct 2024
Using RLHF to align speech enhancement approaches to mean-opinion quality scores
Anurag Kumar
Andrew Perrault
Donald S. Williamson
24
0
0
17 Oct 2024
Investigation of Speaker Representation for Target-Speaker Speech Processing
Takanori Ashihara
Takafumi Moriya
Shota Horiguchi
Junyi Peng
Tsubasa Ochiai
Marc Delcroix
Kohei Matsuura
Hiroshi Sato
31
1
0
15 Oct 2024