Attention is All You Need in Speech Separation

25 October 2020

Mirco Ravanelli

Papers citing "Attention is All You Need in Speech Separation"

50 / 219 papers shown

Title
Latent Iterative Refinement for Modular Source Separation Dimitrios Bralios Efthymios Tzinis G. Wichern Paris Smaragdis Jonathan Le Roux BDL 33 5 0 22 Nov 2022
Hybrid Transformers for Music Source Separation Simon Rouard Francisco Massa Alexandre Défossez 16 128 0 15 Nov 2022
Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts Xiaofei Wang Zhuo Chen Yu Shi Jian Wu Naoyuki Kanda Takuya Yoshioka MoE 21 1 0 11 Nov 2022
Speech separation with large-scale self-supervised learning Zhuo Chen Naoyuki Kanda Jian Wu Yu-Huan Wu Xiaofei Wang Takuya Yoshioka Jinyu Li S. Sivasankaran Sefik Emre Eskimez 19 14 0 09 Nov 2022
Real-Time Target Sound Extraction Bandhav Veluri Justin Chan Malek Itani Tuochao Chen Takuya Yoshioka Shyamnath Gollakota 36 30 0 04 Nov 2022
Diffusion-based Generative Speech Source Separation Robin Scheibler Youna Ji Soo-Whan Chung J. Byun Soyeon Choe Min-Seok Choi DiffM 24 39 0 31 Oct 2022
CasNet: Investigating Channel Robustness for Speech Separation Fan Wang Yao-Fei Cheng Hung-Shin Lee Yu Tsao Hsin-Min Wang 20 2 0 27 Oct 2022
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation William Ravenscroft Stefan Goetze Thomas Hain 27 11 0 27 Oct 2022
Individualized Conditioning and Negative Distances for Speaker Separation Tao Sun Nidal Abuhajar Shuyu Gong Zhewei Wang Charles D. Smith Xianhui Wang Li Xu Jundong Liu VLM 24 1 0 12 Oct 2022
Music Source Separation with Band-split RNN Yi Luo Jianwei Yu 54 107 0 30 Sep 2022
TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation Zhong-Qiu Wang Samuele Cornell Shukjae Choi Younglo Lee Byeonghak Kim Shinji Watanabe 74 96 0 08 Sep 2022
Analysis of impact of emotions on target speech extraction and speech separation Jan Švec Kateřina Žmolíková M. Kocour Marc Delcroix Tsubasa Ochiai Ladislav Mošner Jan "Honza" Černocký 15 4 0 15 Aug 2022
Spatial Aware Multi-Task Learning Based Speech Separation Wei Sun Mei Wang L. Qiu 11 3 0 20 Jul 2022
Dual-Path Cross-Modal Attention for better Audio-Visual Speech Extraction Zhongweiyang Xu Xulin Fan M. Hasegawa-Johnson 11 2 0 09 Jul 2022
Learning to Separate Voices by Spatial Regions Alan Xu Romit Roy Choudhury 31 10 0 09 Jul 2022
Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation Jian Luo Jianzong Wang Ning Cheng Edward Xiao Xulong Zhang Jing Xiao ViT 19 12 0 28 Jun 2022
ClearBuds: Wireless Binaural Earbuds for Learning-Based Speech Enhancement Ishan Chatterjee Maruchi Kim V. Jayaram Shyamnath Gollakota Ira Kemelmacher-Shlizerman Shwetak N. Patel S. M. Seitz 10 25 0 27 Jun 2022
A two-stage full-band speech enhancement model with effective spectral compression mapping Zhongshu Hou Qi Hu Kai-Jyun Chen Jing Lu 31 0 0 27 Jun 2022
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes Danilo de Oliveira Tal Peer Timo Gerkmann 13 18 0 23 Jun 2022
Resource-Efficient Separation Transformer Luca Della Libera Cem Subakan Mirco Ravanelli Samuele Cornell Frédéric Lepoutre François Grondin VLM 35 15 0 19 Jun 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention Zhepei Wang Ritwik Giri Shrikant Venkataramani Umut Isik J. Valin Paris Smaragdis Mike Goodwin A. Krishnaswamy 16 7 0 18 Jun 2022
On the Use of Deep Mask Estimation Module for Neural Source Separation Systems Kai Li Xiaolin Hu Yi Luo 12 15 0 15 Jun 2022
On the Design and Training Strategies for RNN-based Online Neural Speech Separation Systems Kai Li Yi Luo 24 11 0 15 Jun 2022
SepIt: Approaching a Single Channel Speech Separation Bound Shahar Lutati Eliya Nachmani Lior Wolf VLM 43 27 0 24 May 2022
Deep Learning and Synthetic Media Raphaël Millière 18 18 0 11 May 2022
Heterogeneous Separation Consistency Training for Adaptation of Unsupervised Speech Separation Jiangyu Han Yanhua Long 25 6 0 23 Apr 2022
STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency Zhong-Qiu Wang G. Wichern Shinji Watanabe Jonathan Le Roux 19 36 0 21 Apr 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System M. Z. Ozturk Chenshu Wu Beibei Wang Min Wu K. Liu 24 20 0 14 Apr 2022
Multichannel Speech Separation with Narrow-band Conformer Changsheng Quan Xiaofei Li 31 12 0 09 Apr 2022
Phase-Aware Deep Speech Enhancement: It's All About The Frame Length Tal Peer Timo Gerkmann 14 21 0 30 Mar 2022
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks Fan Wang Hung-Shin Lee Yu Tsao Hsin-Min Wang 21 4 0 30 Mar 2022
Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation Xue Yang C. Bao 27 3 0 25 Mar 2022
Improving the transferability of speech separation by meta-learning Kuan-Po Huang Yuan-Kuei Wu Hung-yi Lee 35 1 0 11 Mar 2022
MANNER: Multi-view Attention Network for Noise Erasure Hyun Joon Park Byung Ha Kang Wooseok Shin Jin Sob Kim S. W. Han 30 48 0 04 Mar 2022
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing Efthymios Tzinis Yossi Adi V. Ithapu Buye Xu Paris Smaragdis Anurag Kumar CLL 22 54 0 17 Feb 2022
MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training Ertuğ Karamatlı S. Kırbız SSL 36 9 0 08 Feb 2022
Exploring Self-Attention Mechanisms for Speech Separation Cem Subakan Mirco Ravanelli Samuele Cornell François Grondin Mirko Bronzi 32 23 0 06 Feb 2022
Active Audio-Visual Separation of Dynamic Sound Sources Sagnik Majumder Kristen Grauman 19 21 0 02 Feb 2022
New Insights on Target Speaker Extraction Mohamed Elminshawi Wolfgang Mack Srikanth Raj Chetupalli Soumitro Chakrabarty Emanuel Habets 11 18 0 01 Feb 2022
SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation Chenda Li Lei Yang Weiqin Wang Y. Qian 29 24 0 26 Jan 2022
Fostering the Robustness of White-Box Deep Neural Network Watermarks by Neuron Alignment Fangqi Li Shi-Lin Wang Yun Zhu 27 13 0 28 Dec 2021
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech Rohit Paturi S. Srinivasan Katrin Kirchhoff Daniel Garcia-Romero 17 9 0 10 Dec 2021
A Time-domain Real-valued Generalized Wiener Filter for Multi-channel Neural Separation Systems Yi Luo 29 14 0 07 Dec 2021
A Survey of Deep Learning for Low-Shot Object Detection Qihan Huang Haofei Zhang Mengqi Xue Jie Song Xiuming Zhang ObjD 33 18 0 06 Dec 2021
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network Xiaolin Hu Kai Li Weiyi Zhang Yi Luo Jean-Marie Lemercier Timo Gerkmann 49 47 0 04 Dec 2021
BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement Sunwoo Kim Minje Kim 29 4 0 17 Nov 2021
Monaural source separation: From anechoic to reverberant environments Tobias Cord-Landwehr Christoph Boeddeker Thilo von Neumann Catalin Zorila R. Doddipatla Reinhold Haeb-Umbach 11 31 0 15 Nov 2021
MT3: Multi-Task Multitrack Music Transcription Josh Gardner Ian Simon Ethan Manilow Curtis Hawthorne Jesse Engel 29 94 0 04 Nov 2021
Cross-attention conformer for context modeling in speech enhancement for ASR A. Narayanan Chung-Cheng Chiu Tom O'Malley Quan Wang Yanzhang He 24 14 0 30 Oct 2021
SA-SDR: A novel loss function for separation of meeting style data Thilo von Neumann K. Kinoshita Christoph Boeddeker Marc Delcroix Reinhold Haeb-Umbach 29 20 0 29 Oct 2021