Learning Filterbanks for End-to-End Acoustic Beamforming Samuele Cornell Manuel Pariente François Grondin S. Squartini 30 7 0 08 Nov 2021
TorchAudio: Building Blocks for Audio and Speech Processing Yao-Yuan Yang Moto Hira Zhaoheng Ni Anjali Chourdia Artyom Astafurov ... Sean Narenthiran Shinji Watanabe Soumith Chintala Vincent Quenneville-Bélair Yangyang Shi 31 164 0 28 Oct 2021
Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit Hugo Flores Garcia Aldo Aguilar Ethan Manilow Dmitry Vedenko Bryan Pardo 14 2 0 25 Oct 2021
Progressive Learning for Stabilizing Label Selection in Speech Separation with Mapping-based Method Chenyang Gao Yue Gu I. Marsic 38 0 0 20 Oct 2021
ESPnet2-TTS: Extending the Edge of TTS Research Tomoki Hayashi Ryuichi Yamamoto Takenori Yoshimura Peter Wu Jiatong Shi Takaaki Saeki Yooncheol Ju Yusuke Yasuda Shinnosuke Takamichi Shinji Watanabe VLM 50 60 0 15 Oct 2021
Music Source Separation with Deep Equilibrium Models Yuichiro Koyama Naoki Murata Stefan Uhlich Giorgio Fabbro Shusuke Takahashi Yuki Mitsufuji 31 5 0 13 Oct 2021
End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression Karn N. Watcharasupat Thi Ngoc Tho Nguyen W. Gan Shengkui Zhao Bin Ma 33 12 0 02 Oct 2021
Noisy-to-Noisy Voice Conversion Framework with Denoising Model Chao Xie Yi-Chiao Wu Patrick Lumban Tobing Wen-Chin Huang T. Toda 18 7 0 22 Sep 2021
Don't Separate, Learn to Remix: End-to-End Neural Remixing with Joint Optimization Haici Yang Shivani Firodiya Nicholas J. Bryan Minje Kim 32 7 0 28 Jul 2021
Improving Reverberant Speech Separation with Multi-stage Training and Curriculum Learning R. Aralikatti Anton Ratnarajah Zhenyu Tang Tianyi Zhou 10 2 0 19 Jul 2021
Separation Guided Speaker Diarization in Realistic Mismatched Conditions Shu-Tong Niu Jun Du Lei Sun Chin-Hui Lee 6 4 0 06 Jul 2021
DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement Yuma Koizumi Shigeki Karita Scott Wisdom Hakan Erdogan J. Hershey Llion Jones M. Bacchiani 19 41 0 30 Jun 2021
Few-shot learning of new sound classes for target sound extraction Marc Delcroix Jorge Bennasar Vázquez Tsubasa Ochiai K. Kinoshita S. Araki VLM 21 11 0 14 Jun 2021
SpeechBrain: A General-Purpose Speech Toolkit Mirco Ravanelli Titouan Parcollet Peter William VanHarn Plantinga Aku Rouhe Samuele Cornell ... William Aris Hwidong Na Yan Gao R. Mori Yoshua Bengio 10 751 0 08 Jun 2021
Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex Keitaro Tanaka Ryosuke Sawata Shusuke Takahashi 17 0 0 04 Jun 2021
Test-Time Adaptation Toward Personalized Speech Enhancement: Zero-Shot Learning with Knowledge Distillation Sunwoo Kim Minje Kim 28 18 0 08 May 2021
Efficient Personalized Speech Enhancement through Self-Supervised Learning Aswin Sivaraman Minje Kim 18 19 0 05 Apr 2021
USTC-NELSLIP System Description for DIHARD-III Challenge Yuxuan Wang Maokui He Shutong Niu Lei Sun Tian Gao Xin Fang Jia Pan Jun Du Chin-Hui Lee 8 28 0 19 Mar 2021
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect Jun Wang Max W. Y. Lam Dan Su Dong Yu 22 6 0 02 Mar 2021
TransMask: A Compact and Fast Speech Separation Model Based on Transformer Zining Zhang Bingsheng He Zhenjie Zhang 21 21 0 19 Feb 2021
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans Shinji Watanabe Florian Boyer Xuankai Chang Pengcheng Guo Tomoki Hayashi ... Shigeki Karita Chenda Li Jing Shi Aswin Shanmugam Subramanian Wangyou Zhang VLM 39 38 0 23 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch Joseph P. Turian Max Henry 24 29 0 08 Dec 2020
ESPnet-SE: End-to-End Speech Enhancement and Separation Toolkit Designed for ASR Integration Chenda Li Jing Shi Wangyou Zhang Aswin Shanmugam Subramanian Xuankai Chang ... Moto Hira Tomoki Hayashi Christoph Boeddeker Zhuo Chen Shinji Watanabe VLM 31 81 0 07 Nov 2020
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training Sung-Feng Huang Shun-Po Chuang Da-Rong Liu Yi-Chen Chen Gene-Ping Yang Hung-yi Lee SSL 31 14 0 29 Oct 2020
Attention is All You Need in Speech Separation Cem Subakan Mirco Ravanelli Samuele Cornell Mirko Bronzi Jianyuan Zhong 45 537 0 25 Oct 2020
Towards Listening to 10 People Simultaneously: An Efficient Permutation Invariant Training of Audio Source Separation Using Sinkhorn's Algorithm Hideyuki Tachibana 18 14 0 22 Oct 2020
All for One and One for All: Improving Music Separation by Bridging Networks Ryosuke Sawata Stefan Uhlich Shusuke Takahashi Yuki Mitsufuji 13 47 0 08 Oct 2020
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation Zhong-Qiu Wang Peidong Wang DeLiang Wang 24 88 0 04 Oct 2020
Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment Hangting Chen Pengyuan Zhang 11 6 0 01 Jul 2020