End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

30 October 2019

Papers citing "End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation"

50 / 93 papers shown

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior Zhongweiyang Xu Xulin Fan Zhong-Qiu Wang Xilin Jiang Romit Roy Choudhury DiffM 46 0 0 08 May 2025
Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition Yufeng Yang H. Taherian Vahid Ahmadi Kalkhorani DeLiang Wang 44 0 0 23 Mar 2025
UniArray: Unified Spectral-Spatial Modeling for Array-Geometry-Agnostic Speech Separation Weiguang Chen Junjie Zhang Jielong Yang Eng Siong Chng Xionghu Zhong 68 0 0 07 Mar 2025
EDSep: An Effective Diffusion-Based Method for Speech Source Separation Jinwei Dong Xinsheng Wang Qirong Mao 63 0 0 28 Jan 2025
Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization Yoshiki Masuyama G. Wichern François Germain Christopher Ick Jonathan Le Roux 80 0 0 22 Jan 2025
30+ Years of Source Separation Research: Achievements and Future Challenges S. Araki N. Ito Reinhold Haeb-Umbach G. Wichern Zhong-Qiu Wang Yuki Mitsufuji AI4TS 39 0 0 21 Jan 2025
Joint Beamforming and Speaker-Attributed ASR for Real Distant-Microphone Meeting Transcription Can Cui Imran A. Sheikh Mostafa Sadeghi Emmanuel Vincent 34 0 0 29 Oct 2024
A Two-Stage Band-Split Mamba-2 Network For Music Separation Jinglin Bai Yuan Fang Jiajie Wang Xueliang Zhang Mamba 27 1 0 10 Sep 2024
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization Bing Yang Changsheng Quan Yabo Wang Pengyu Wang Yujie Yang Ying Fang Nian Shao Hui Bu Xin Xu Xiaofei Li 43 5 0 28 Jun 2024
Binaural Selective Attention Model for Target Speaker Extraction Hanyu Meng Qiquan Zhang Xiangyu Zhang V. Sethu Eliathamby Ambikairajah 33 1 0 18 Jun 2024
Neural Blind Source Separation and Diarization for Distant Speech Recognition Yoshiaki Bando Tomohiko Nakamura Shinji Watanabe BDL 34 5 0 12 Jun 2024
IPDnet: A Universal Direct-Path IPD Estimation Network for Sound Source Localization Ya-Bin Wang Bing Yang Xiaofei Li 45 1 0 11 May 2024
Gull: A Generative Multifunctional Audio Codec Yi Luo Jianwei Yu Hangting Chen Rongzhi Gu Chao Weng AuLLM 41 3 0 07 Apr 2024
CrossNet: Leveraging Global, Cross-Band, Narrow-Band, and Positional Encoding for Single- and Multi-Channel Speaker Separation Vahid Ahmadi Kalkhorani DeLiang Wang 40 3 0 06 Mar 2024
Channel-Combination Algorithms for Robust Distant Voice Activity and Overlapped Speech Detection Théo Mariotte Anthony Larcher Silvio Montrésor Jean-Hugh Thomas 27 2 0 13 Feb 2024
Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers Marvin Tammen Tsubasa Ochiai Marc Delcroix Tomohiro Nakatani S. Araki Simon Doclo 18 0 0 05 Feb 2024
Improving Design of Input Condition Invariant Speech Enhancement Wangyou Zhang Jee-weon Jung Shinji Watanabe Yanmin Qian AAML 28 3 0 25 Jan 2024
Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement Ashutosh Pandey Buye Xu 40 1 0 15 Jan 2024
Binaural multichannel blind speaker separation with a causal low-latency and low-complexity approach Nils L. Westhausen Bernd T. Meyer BDL 35 3 0 08 Dec 2023
Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference Kai Li Yi Luo 26 2 0 06 Dec 2023
Toward Universal Speech Enhancement for Diverse Input Conditions Wangyou Zhang Kohei Saijo Zhong-Qiu Wang Shinji Watanabe Yanmin Qian VLM 27 18 0 29 Sep 2023
Multichannel Voice Trigger Detection Based on Transform-average-concatenate Takuya Higuchi Avamarie Brueggeman Han Fang Stephen Shum 18 0 0 27 Sep 2023
Addressing Feature Imbalance in Sound Source Separation Jaechang Kim Jeongyeon Hwang Soheun Yi Jaewoong Cho Jungseul Ok 22 0 0 11 Sep 2023
ReZero: Region-customizable Sound Extraction Rongzhi Gu Yi Luo 32 12 0 31 Aug 2023
Dual-path Transformer Based Neural Beamformer for Target Speech Extraction Aoqi Guo Sichong Qian Baoxiang Li Dazhi Gao 31 1 0 30 Aug 2023
Enhancing Mobile Privacy and Security: A Face Skin Patch-Based Anti-Spoofing Approach Qiushi Guo PICV CVBM 22 2 0 09 Aug 2023
SpatialNet: Extensively Learning Spatial Information for Multichannel Joint Speech Separation, Denoising and Dereverberation Changsheng Quan Xiaofei Li 18 35 0 31 Jul 2023
Spatial-temporal Graph Based Multi-channel Speaker Verification With Ad-hoc Microphone Arrays Yijiang Chen Chen Liang Xiao-Lei Zhang 28 1 0 03 Jul 2023
Graph neural networks for sound source localization on distributed microphone networks Eric Grinstein Mike Brookes Patrick A. Naylor 6 8 0 28 Jun 2023
Unsupervised Multi-channel Separation and Adaptation Cong Han K. Wilson Scott Wisdom J. Hershey 18 4 0 18 May 2023
Inter-SubNet: Speech Enhancement with Subband Interaction Jun Chen Wei Rao Z. Wang Jiuxin Lin Zhiyong Wu Yannan Wang Shidong Shang Helen M. Meng 11 13 0 09 May 2023
Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters Kristina Tesch Timo Gerkmann 24 16 0 24 Apr 2023
Fast Random Approximation of Multi-channel Room Impulse Response Yi Luo Rongzhi Gu 12 4 0 17 Apr 2023
Guided Speech Enhancement Network Yang Yang Shao-fu Shih Hakan Erdogan J. Lin C. Lee Yunpeng Li George Sung Matthias Grundmann 28 6 0 13 Mar 2023
DFSNet: A Steerable Neural Beamformer Invariant to Microphone Array Configuration for Real-Time, Low-Latency Speech Enhancement A. Kovalyov Kashyap Patel Issa Panahi 26 3 0 26 Feb 2023
DasFormer: Deep Alternating Spectrogram Transformer for Multi/Single-Channel Speech Separation Shuo Wang Xiangyu Kong Xiulian Peng H. Movassagh Vinod Prakash Yan Lu 26 11 0 21 Feb 2023
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation Rongzhi Gu Shi-Xiong Zhang Yuexian Zou Dong Yu AI4TS 22 24 0 16 Dec 2022
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement Dongheon Lee Jung-Woo Choi 27 25 0 15 Dec 2022
MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation Yanjie Fu Haoran Yin Meng Ge Longbiao Wang Gaoyan Zhang J. Dang Chengyun Deng Fei Wang CVBM 18 2 0 07 Dec 2022
NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer Changsheng Quan Xiaofei Li 30 2 0 05 Dec 2022
A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem Andong Li Guochen Yu C. Zheng Wenzhe Liu Xiaodong Li 43 10 0 30 Nov 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation Zhongqiu Wang Samuele Cornell Shukjae Choi Younglo Lee Byeonghak Kim Shinji Watanabe 32 119 0 22 Nov 2022
TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective Andong Li Guochen Yu Wenzhe Liu Xiaodong Li C. Zheng 22 2 0 22 Nov 2022
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings Mohan Shi Jie Zhang Zhihao Du Fan Yu Qian Chen Shiliang Zhang Lirong Dai 51 4 0 01 Nov 2022
TT-Net: Dual-path transformer based sound field translation in the spherical harmonic domain Yiwen Wang Zijian Lan Xihong Wu T. Qu 15 1 0 30 Oct 2022
Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization Shota Horiguchi Yuki Takashima Shinji Watanabe Leibny Paola García-Perera 36 2 0 07 Oct 2022
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition Naoyuki Kanda Jian Wu Xiaofei Wang Zhuo Chen Jinyu Li Takuya Yoshioka 29 16 0 12 Sep 2022
Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain Dejan Marković Alexandre Défossez Alexander Richard 16 16 0 30 Jun 2022
ClearBuds: Wireless Binaural Earbuds for Learning-Based Speech Enhancement Ishan Chatterjee Maruchi Kim V. Jayaram Shyamnath Gollakota Ira Kemelmacher-Shlizerman Shwetak N. Patel S. M. Seitz 18 25 0 27 Jun 2022
Insights Into Deep Non-linear Filters for Improved Multi-channel Speech Enhancement Kristina Tesch Timo Gerkmann 25 58 0 27 Jun 2022