Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

20 September 2018

Papers citing "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

50 / 773 papers shown

Title
Improving Source Extraction with Diffusion and Consistency Models Tornike Karchkhadze M. Izadi Shuo Zhang DiffM 145 1 0 09 Dec 2024
Multiple Choice Learning for Efficient Speech Separation with Many Speakers David Perera François Derrida Théo Mariotte Gaël Richard S. Essid 106 0 0 27 Nov 2024
State-Space Large Audio Language Models Saurabhchand Bhati Yuan Gong Leonid Karlinsky Hilde Kuehne Rogerio Feris James Glass 151 1 0 24 Nov 2024
Speech Separation with Pretrained Frontend to Minimize Domain Mismatch Wupeng Wang Zexu Pan Xianrui Li Shuai Wang Haoyang Li 78 4 0 05 Nov 2024
Task-Aware Unified Source Separation Kohei Saijo Janek Ebbers François Germain Gordon Wichern Jonathan Le Roux 75 2 0 31 Oct 2024
Joint Beamforming and Speaker-Attributed ASR for Real Distant-Microphone Meeting Transcription Can Cui Imran A. Sheikh Mostafa Sadeghi Emmanuel Vincent 91 0 0 29 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis Luca Jiang-Tao Yu Running Zhao Sijie Ji Edith C.H. Ngai Chenshu Wu 52 0 0 29 Oct 2024
SepMamba: State-space models for speaker separation using Mamba Thor Højhus Avenstrup Boldizsár Elek István László Mádi András Bence Schin Morten Mørup Bjørn Sand Jensen Kenny Falkær Olsen Mamba 55 0 0 28 Oct 2024
CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning Sjoerd Groot Qinyu Chen Jan C. van Gemert Chang Gao Mamba 458 0 0 14 Oct 2024
In-Materia Speech Recognition Mohamadreza Zolfagharinejad Julian Büchel Lorenzo Cassola Sachin Kinge Ghazi Sarwat Syed Abu Sebastian Wilfred G. van der Wiel 71 0 0 14 Oct 2024
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement Kuan-Chen Wang Kai-Chun Liu Ping-Cheng Yeh Sheng-Yu Peng Yu Tsao 54 1 0 04 Oct 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios Kai Li Wendi Sang Chang Zeng Runxuan Yang Guo Chen Xiaolin Hu 128 3 0 02 Oct 2024
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation Mohan Xu Kai Li Guo Chen Xiaolin Hu 87 2 0 02 Oct 2024
Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds Hanbin Bae Pavel Andreev Azat Saginbaev Nicholas Babaev Won-Jun Lee Hosang Sung Hoon-Young Cho 68 0 0 27 Sep 2024
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration Pin-Jui Ku Alexander H. Liu Roman Korostik Sung-Feng Huang Szu-Wei Fu Ante Jukić 77 4 0 24 Sep 2024
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction Shuai Wang Ke Zhang Shaoxiong Lin Junjie Li Xuefei Wang Meng Ge Jianwei Yu Yanmin Qian Haizhou Li 75 10 0 24 Sep 2024
GALD-SE: Guided Anisotropic Lightweight Diffusion for Efficient Speech Enhancement Chengzhong Wang Jianjun Gu Dingding Yao Junfeng Li Yonghong Yan DiffM 474 0 0 23 Sep 2024
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction Kohei Saijo Janek Ebbers François Germain Sameer Khurana Gordon Wichern Jonathan Le Roux 99 1 0 20 Sep 2024
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model Carlos Hernandez-Olivan Marc Delcroix Tsubasa Ochiai Daisuke Niizumi Naohiro Tawara Tomohiro Nakatani Shoko Araki 54 2 0 19 Sep 2024
Geometry-Constrained EEG Channel Selection for Brain-Assisted Speech Enhancement Keying Zuo Qingtian Xu Jie Zhang Zhenhua Ling 93 0 0 19 Sep 2024
A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation Jingyuan Wang Jie Zhang Shihao Chen Miao Sun 74 0 0 19 Sep 2024
Learning Source Disentanglement in Neural Audio Codec Xiaoyu Bie Xubo Liu Gaël Richard 97 2 0 17 Sep 2024
Ultra-Low Latency Speech Enhancement - A Comprehensive Study Haibin Wu Sebastian Braun 100 0 0 16 Sep 2024
Language-Queried Target Sound Extraction Without Parallel Training Data Hao Ma Zhiyuan Peng Xu Li Yukai Li Mingjie Shao Qiuqiang Kong Xuelong Li VLM 185 2 0 14 Sep 2024
Biomimetic Frontend for Differentiable Audio Processing Ruolan Leslie Famularo D. Zotkin S. Shamma R. Duraiswami AI4TS 78 0 0 13 Sep 2024
TSELM: Target Speaker Extraction using Discrete Tokens and Language Models Beilong Tang Bang Zeng Ming Li 80 4 0 12 Sep 2024
DENSE: Dynamic Embedding Causal Target Speech Extraction Yiwen Wang Zeyu Yuan Xihong Wu 71 0 0 10 Sep 2024
The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss Gerardo Roa Dabike Michael A. Akeroyd Scott Bannister Jon P. Barker Trevor J. Cox ... Jennifer Firth S. Graetzer Alinka Greasley Rebecca R. Vos W. Whitmer 61 1 0 08 Sep 2024
Mel-RoFormer for Vocal Separation and Vocal Melody Transcription Ju-Chiang Wang Fan Zhang Jitong Chen 63 2 0 07 Sep 2024
NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention Dashanka De Silva Siqi Cai Saurav Pahuja Tanja Schultz Haizhou Li 119 0 0 04 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction Bang Zeng Ming Li 107 5 0 04 Sep 2024
Spectron: Target Speaker Extraction using Conditional Transformer with Adversarial Refinement Tathagata Bandyopadhyay ViT 85 0 0 02 Sep 2024
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization Zengrui Jin Yifan Yang Mohan Shi Wei Kang Xiaoyu Yang ... Lingwei Meng Long Lin Yong Xu Shi-Xiong Zhang Daniel Povey 76 3 0 01 Sep 2024
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation Kai Chen Jiaqi Su Taylor Berg-Kirkpatrick Shlomo Dubnov Zeyu Jin 63 2 0 28 Aug 2024
Comparative Analysis Of Discriminative Deep Learning-Based Noise Reduction Methods In Low SNR Scenarios Shrishti Saha Shetu Emanuël A. P. Habets Andreas Brendel 70 2 0 26 Aug 2024
Efficient Area-based and Speaker-Agnostic Source Separation Martin Strauss Okan Kopuklu 60 3 0 19 Aug 2024
DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement Tao Sun Sander Bohté 50 4 0 14 Aug 2024
BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network based on Self-Supervised Embedding Alimjan Mattursun Liejun Wang Yinfeng Yu 116 2 0 13 Aug 2024
Source Separation of Multi-source Raw Music using a Residual Quantized Variational Autoencoder Leonardo Berti DRL 65 0 0 12 Aug 2024
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement Kohei Saijo Gordon Wichern François G. Germain Zexu Pan Jonathan Le Roux 65 9 0 06 Aug 2024
Enhanced Reverberation as Supervision for Unsupervised Speech Separation Kohei Saijo Gordon Wichern François G. Germain Zexu Pan Jonathan Le Roux 79 1 0 06 Aug 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues Tianrui Pan Jie Liu Bohan Wang Jie Tang Gangshan Wu 85 2 0 27 Jul 2024
Robustness of Speech Separation Models for Similar-pitch Speakers Bunlong Lay Sebastian Zaczek Kristina Tesch Timo Gerkmann 35 0 0 22 Jul 2024
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis Xilin Jiang Yinghao Aaron Li Adrian Nicolas Florea Cong Han N. Mesgarani Mamba 97 14 0 13 Jul 2024
A review of graph neural network applications in mechanics-related domains Yingxue Zhao Haoran Li Haosu Zhou H. Attar Tobias Pfaff Nan Li AI4CE 134 8 0 10 Jul 2024
Improving Speech Enhancement by Integrating Inter-Channel and Band Features with Dual-branch Conformer Jizhen Li Xinmeng Xu Weiping Tu Yuhong Yang Rong Zhu 100 1 0 09 Jul 2024
A Reference-free Metric for Language-Queried Audio Source Separation using Contrastive Language-Audio Pretraining Feiyang Xiao Jian Guan Qiaoxi Zhu Xubo Liu Wenbo Wang Shuhan Qi Kejia Zhang Jianyuan Sun Wenwu Wang 77 7 0 06 Jul 2024
All Neural Low-latency Directional Speech Extraction Ashutosh Pandey Sanha Lee Juan Azcarreta Daniel D. E. Wong Buye Xu 52 3 0 05 Jul 2024
Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model Yu-Kuan Fu Cheng-Kuang Lee Hsiu-Hsuan Wang Hung-yi Lee 54 0 0 02 Jul 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling Hiroshi Sato Takafumi Moriya Masato Mimura Shota Horiguchi Tsubasa Ochiai Takanori Ashihara Atsushi Ando Kentaro Shinayama Marc Delcroix 72 2 0 01 Jul 2024