Asteroid: the PyTorch-based audio source separation toolkit for researchers

8 May 2020
Manuel Pariente
Samuele Cornell
Joris Cosentino
S. Sivasankaran
Efthymios Tzinis
Jens Heitkaemper
Michel Olvera
Fabian-Robert Stöter
Mathieu Hu
Juan M. Martín-Donas
David Ditter
Ariel Frank
Antoine Deleforge
Emmanuel Vincent

Papers citing "Asteroid: the PyTorch-based audio source separation toolkit for researchers"

50 / 79 papers shown
SonicSieve: Bringing Directional Speech Extraction to Smartphones Using Acoustic Microstructures
Kuang Yuan
Yifeng Wang
Xiyuxing Zhang
Chengyi Shen
Swarun Kumar
Justin Chan
29
0
0
15 Apr 2025
Whispering Under the Eaves: Protecting User Privacy Against Commercial and LLM-powered Automatic Speech Recognition Systems
Weifei Jin
Yuxin Cao
Junjie Su
Derui Wang
Yedi Zhang
Minhui Xue
Jie Hao
Jin Song Dong
Yixian Yang
AAML
57
0
0
01 Apr 2025
Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music
Eetu Tunturi
David Diaz-Guerra
A. Politis
Tuomas Virtanen
43
0
0
10 Mar 2025
30+ Years of Source Separation Research: Achievements and Future Challenges
S. Araki
N. Ito
Reinhold Haeb-Umbach
G. Wichern
Zhong-Qiu Wang
Yuki Mitsufuji
AI4TS
39
0
0
21 Jan 2025
GhostRNN: Reducing State Redundancy in RNN with Cheap Operations
Hang Zhou
Xiaoxu Zheng
Yunhe Wang
Michael Bi Mi
Deyi Xiong
Kai Han
59
0
0
20 Nov 2024
Joint Beamforming and Speaker-Attributed ASR for Real Distant-Microphone Meeting Transcription
Can Cui
Imran A. Sheikh
Mostafa Sadeghi
Emmanuel Vincent
34
0
0
29 Oct 2024
Enhancing Crowdsourced Audio for Text-to-Speech Models
José Giraldo
Martí Llopart-Font
Alex Peiró-Lilja
Carme Armentano-Oller
Gerard Sant
Baybars Külebi
DiffM
26
0
0
17 Oct 2024
SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation
Jaime Garcia-Martinez
David Diaz-Guerra
A. Politis
Tuomas Virtanen
J. Carabias-Orti
P. Vera-Candeas
24
1
0
17 Sep 2024
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration
Masao Someki
Kwanghee Choi
Siddhant Arora
William Chen
Samuele Cornell
Jionghao Han
Yifan Peng
Jiatong Shi
Vaibhav Srivastav
Shinji Watanabe
VLM
32
0
0
14 Sep 2024
DENSE: Dynamic Embedding Causal Target Speech Extraction
Yiwen Wang
Zeyu Yuan
Xihong Wu
41
0
0
10 Sep 2024
Papez: Resource-Efficient Speech Separation with Auditory Working Memory
Hyunseok Oh
Juheon Yi
Youngki Lee
19
2
0
01 Jul 2024
Effects of Dataset Sampling Rate for Noise Cancellation through Deep Learning
Brandon Colelough
Andrew Zheng
24
1
0
30 May 2024
Look Once to Hear: Target Speech Hearing with Noisy Examples
Bandhav Veluri
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
38
14
0
10 May 2024
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
Tsubasa Ochiai
Kazuma Iwamoto
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
Shoko Araki
Shigeru Katagiri
29
2
0
23 Apr 2024
CATSE: A Context-Aware Framework for Causal Target Sound Extraction
Shrishail Baligar
M. Kegler
Bryce Irvin
Marko Stamenovic
Shawn Newsam
33
0
0
21 Mar 2024
Online speaker diarization of meetings guided by speech separation
Elio Gruttadauria
Mathieu Fontaine
S. Essid
17
4
0
30 Jan 2024
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
Orlem Lima dos Santos
Karen Rosero
R. Lotufo
SSL
19
2
0
12 Dec 2023
Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation
Chenyu Gao
Yue Gu
I. Marsic
18
0
0
20 Nov 2023
Zero-Shot Duet Singing Voices Separation with Diffusion Models
Chin-Yun Yu
Emilian Postolache
Emanuele Rodolà
Gyorgy Fazekas
DiffM
15
3
0
13 Nov 2023
Deep Audio Analyzer: a Framework to Industrialize the Research on Audio Forensics
Valerio Francesco Puglisi
O. Giudice
Sebastiano Battiato
19
1
0
29 Oct 2023
Refining DNN-based Mask Estimation using CGMM-based EM Algorithm for Multi-channel Noise Reduction
Julitta Bartolewska
Stanisław Kacprzak
K. Kowalczyk
15
0
0
18 Sep 2023
Causal Signal-Based DCCRN with Overlapped-Frame Prediction for Online Speech Enhancement
Julitta Bartolewska
Stanisław Kacprzak
K. Kowalczyk
16
2
0
07 Sep 2023
Remixing-based Unsupervised Source Separation from Scratch
Kohei Saijo
Tetsuji Ogawa
11
3
0
01 Sep 2023
Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations
Salah Zaiem
Titouan Parcollet
S. Essid
38
2
0
01 Jun 2023
End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations
Giovanni Morrone
Samuele Cornell
L. Serafini
Enrico Zovato
A. Brutti
S. Squartini
23
4
0
21 Mar 2023
A Multimodal Data-driven Framework for Anxiety Screening
Haimiao Mo
Shuai Ding
Siu Cheung Hui
18
6
0
16 Mar 2023
On Neural Architectures for Deep Learning-based Source Separation of Co-Channel OFDM Signals
Gary C. F. Lee
Amir Weiss
A. Lancho
Yury Polyanskiy
G. Wornell
AI4TS
17
6
0
11 Mar 2023
Scaling strategies for on-device low-complexity source separation with Conv-Tasnet
Mohamed Nabih Ali
Francesco Paissan
Daniele Falavigna
A. Brutti
15
2
0
06 Mar 2023
audb -- Sharing and Versioning of Audio and Annotation Data in Python
H. Wierstorf
Johannes Wagner
F. Eyben
Felix Burkhardt
Björn W. Schuller
30
1
0
01 Mar 2023
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
Kai Li
Fenghua Xie
Hang Chen
K. Yuan
Xiaolin Hu
29
14
0
21 Dec 2022
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement
Anastasia Kuznetsova
Aswin Sivaraman
Minje Kim
29
3
0
14 Nov 2022
How to Leverage DNN-based speech enhancement for multi-channel speaker verification?
Sandipana Dowerah
Romain Serizel
D. Jouvet
Mohammad MohammadAmini
D. Matrouf
14
2
0
17 Oct 2022
spatial-dccrn: dccrn equipped with frame-level angle feature and hybrid filtering for multi-channel speech enhancement
Shubo Lv
Yihui Fu
Yukai Jv
Linfu Xie
Weixin Zhu
Wei Rao
Yannan Wang
14
8
0
17 Oct 2022
Can we use Common Voice to train a Multi-Speaker TTS system?
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
27
10
0
12 Oct 2022
Analysis of impact of emotions on target speech extraction and speech separation
Jan Švec
Kateřina Žmolíková
M. Kocour
Marc Delcroix
Tsubasa Ochiai
Ladislav Mošner
Jan "Honza" Černocký
15
4
0
15 Aug 2022
Multimodal Emotion Recognition with Modality-Pairwise Unsupervised Contrastive Loss
Riccardo Franceschini
Enrico Fini
Cigdem Beyan
Alessandro Conti
F. Arrigoni
Elisa Ricci
SSL
OffRL
34
16
0
23 Jul 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding
Yen-Ju Lu
Xuankai Chang
Chenda Li
Wangyou Zhang
Samuele Cornell
...
Robin Scheibler
Zhong-Qiu Wang
Yu Tsao
Y. Qian
Shinji Watanabe
VLM
19
28
0
19 Jul 2022
PodcastMix: A dataset for separating music and speech in podcasts
Nico M. Schmidt
Jordi Pons
M. Miron
19
2
0
15 Jul 2022
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions
Yeonjong Choi
Chao Xie
T. Toda
DiffM
25
2
0
30 Jun 2022
MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes
Anton Ratnarajah
Zhenyu Tang
R. Aralikatti
Tianyi Zhou
AI4CE
20
36
0
18 May 2022
Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
Zifeng Zhao
Rongzhi Gu
Dongchao Yang
Jinchuan Tian
Yuexian Zou
22
2
0
15 Apr 2022
RadioSES: mmWave-Based Audioradio Speech Enhancement and Separation System
M. Z. Ozturk
Chenshu Wu
Beibei Wang
Min Wu
K. Liu
21
20
0
14 Apr 2022
Listen only to me! How well can target speech extraction handle false alarms?
Marc Delcroix
K. Kinoshita
Tsubasa Ochiai
Kateřina Žmolíková
Hiroshi Sato
Tomohiro Nakatani
26
15
0
11 Apr 2022
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix
Jorge Bennasar Vázquez
Tsubasa Ochiai
K. Kinoshita
Yasunori Ohishi
S. Araki
VLM
22
31
0
08 Apr 2022
GWA: A Large High-Quality Acoustic Dataset for Audio Processing
Zhenyu Tang
R. Aralikatti
Anton Ratnarajah
Tianyi Zhou
29
31
0
04 Apr 2022
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks
Fan Wang
Hung-Shin Lee
Yu Tsao
Hsin-Min Wang
21
4
0
30 Mar 2022
How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR
Kazuma Iwamoto
Tsubasa Ochiai
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
S. Araki
S. Katagiri
22
57
0
18 Jan 2022
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech
Rohit Paturi
S. Srinivasan
Katrin Kirchhoff
Daniel Garcia-Romero
17
9
0
10 Dec 2021
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network
Xiaolin Hu
Kai Li
Weiyi Zhang
Yi Luo
Jean-Marie Lemercier
Timo Gerkmann
49
47
0
04 Dec 2021
BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement
Sunwoo Kim
Minje Kim
29
4
0
17 Nov 2021