Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

arXiv:1809.07454 (v3, latest)
20 September 2018
Yi Luo
N. Mesgarani

Papers citing "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

50 / 773 papers shown
Title
Papez: Resource-Efficient Speech Separation with Auditory Working Memory
Hyunseok Oh
Juheon Yi
Youngki Lee
77
3
0
01 Jul 2024
Open-Source Conversational AI with SpeechBrain 1.0
Mirco Ravanelli
Titouan Parcollet
Adel Moumen
Sylvain de Langen
Cem Subakan
...
Salima Mdhaffar
G. Laperriere
Mickael Rouvier
Renato De Mori
Yannick Esteve
VLM
144
16
0
29 Jun 2024
SNR-Progressive Model with Harmonic Compensation for Low-SNR Speech Enhancement
Zhongshu Hou
Tong Lei
Qinwen Hu
Zhanzhong Cao
Ming Tang
Jing Lu
78
2
0
24 Jun 2024
Improved Remixing Process for Domain Adaptation-Based Speech Enhancement by Mitigating Data Imbalance in Signal-to-Noise Ratio
Li Li
Shogo Seki
66
0
0
20 Jun 2024
Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement
Chenda Li
Samuele Cornell
Shinji Watanabe
Yanmin Qian
DiffM
88
2
0
19 Jun 2024
Universal Score-based Speech Enhancement with High Content Preservation
Robin Scheibler
Yusuke Fujita
Yuma Shirahata
Tatsuya Komatsu
DiffM
110
15
0
18 Jun 2024
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling
Vahid Ahmadi Kalkhorani
Cheng Yu
Anurag Kumar
Ke Tan
Buye Xu
DeLiang Wang
92
1
0
17 Jun 2024
SMRU: Split-and-Merge Recurrent-based UNet for Acoustic Echo Cancellation and Noise Suppression
Zhihang Sun
Andong Li
Rilin Chen
Hao Zhang
Meng Yu
Yi Zhou
Dong Yu
138
0
0
17 Jun 2024
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
Guinan Li
Jiajun Deng
Youjun Chen
Mengzhe Geng
Shujie Hu
...
Zengrui Jin
Tianzi Wang
Xurong Xie
Helen Meng
Xunying Liu
VLM
56
0
0
14 Jun 2024
TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information
Yiwen Wang
Xihong Wu
70
2
0
13 Jun 2024
Target Speaker Extraction with Curriculum Learning
Yun Liu
Xuechen Liu
Xiaoxiao Miao
Junichi Yamagishi
61
3
0
12 Jun 2024
Pre-training Feature Guided Diffusion Model for Speech Enhancement
Yiyuan Yang
Niki Trigoni
Andrew Markham
156
3
0
11 Jun 2024
RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention
Mingshuai Liu
Zhuangqi Chen
Xiaopeng Yan
Yuanjun Lv
Xianjun Xia
Chuanzeng Huang
Yijian Xiao
Lei Xie
78
4
0
11 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
85
0
0
11 Jun 2024
Unsupervised Improved MVDR Beamforming for Sound Enhancement
Jacob Kealey
John Hershey
François Grondin
50
0
0
10 Jun 2024
Towards Signal Processing In Large Language Models
Prateek Verma
Mert Pilanci
87
3
0
10 Jun 2024
EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation
Julius Richter
Yi-Chiao Wu
Steven Krenn
Simon Welker
Bunlong Lay
Shinji Watanabe
Alexander Richard
Timo Gerkmann
79
28
0
10 Jun 2024
Thunder : Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge
Thanapat Trachu
Chawan Piansaddhayanon
Ekapol Chuangsuwanich
72
3
0
10 Jun 2024
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
Wangyou Zhang
Robin Scheibler
Kohei Saijo
Samuele Cornell
Chenda Li
...
Jan Pirklbauer
Marvin Sach
Shinji Watanabe
Tim Fingscheidt
Yanmin Qian
VLM
90
20
0
07 Jun 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
Wangyou Zhang
Kohei Saijo
Jee-weon Jung
Chenda Li
Shinji Watanabe
Yanmin Qian
74
7
0
06 Jun 2024
The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement
Danilo de Oliveira
Simon Welker
Julius Richter
Timo Gerkmann
74
7
0
05 Jun 2024
Effects of Dataset Sampling Rate for Noise Cancellation through Deep Learning
Brandon Colelough
Andrew Zheng
93
1
0
30 May 2024
A Near-Real-Time Processing Ego Speech Filtering Pipeline Designed for Speech Interruption During Human-Robot Interaction
Yue Li
Florian A. Kunneman
Koen V. Hindriks
64
2
0
22 May 2024
Look Once to Hear: Target Speech Hearing with Noisy Examples
Bandhav Veluri
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
90
17
0
10 May 2024
Embedded Distributed Inference of Deep Neural Networks: A Systematic Review
Federico Nicolás Peccia
Oliver Bringmann
90
0
0
06 May 2024
TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
Yueyuan Sui
Minghui Zhao
Junxi Xia
Xiaofan Jiang
S. Xia
Mamba
93
11
0
02 May 2024
Deep low-latency joint speech transmission and enhancement over a gaussian channel
Mohammad Bokaei
Jesper Jensen
Simon Doclo
Jan Østergaard
61
0
0
30 Apr 2024
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Ruijie Tao
Xinyuan Qian
Yidi Jiang
Junjie Li
Jiadong Wang
Haizhou Li
73
2
0
29 Apr 2024
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
Tsubasa Ochiai
Kazuma Iwamoto
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
Shoko Araki
Shigeru Katagiri
77
3
0
23 Apr 2024
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
Ye Bai
Chenxing Li
Hao Li
Yuanyuan Zhao
Xiaorui Wang
57
1
0
17 Apr 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
94
27
0
15 Apr 2024
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
Hanyu Meng
V. Sethu
E. Ambikairajah
90
2
0
10 Apr 2024
Gull: A Generative Multifunctional Audio Codec
Yi Luo
Jianwei Yu
Hangting Chen
Rongzhi Gu
Chao Weng
AuLLM
84
3
0
07 Apr 2024
SPMamba: State-space model is all you need in speech separation
Kai Li
Guo Chen
Mamba
89
26
0
02 Apr 2024
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection
Ali Behrouz
Michele Santacatterina
Ramin Zabih
124
33
0
29 Mar 2024
Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation
Xilin Jiang
Cong Han
N. Mesgarani
Mamba
104
49
0
27 Mar 2024
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy
Wenxuan Wu
Xueyuan Chen
Xixin Wu
Haizhou Li
Helen M. Meng
56
3
0
24 Mar 2024
CATSE: A Context-Aware Framework for Causal Target Sound Extraction
Shrishail Baligar
M. Kegler
Bryce Irvin
Marko Stamenovic
Shawn Newsam
70
0
0
21 Mar 2024
Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
Changsheng Quan
Xiaofei Li
120
27
0
12 Mar 2024
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
Yufeng Yang
Ashutosh Pandey
DeLiang Wang
55
4
0
11 Mar 2024
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks
Qu Yang
Qianhui Liu
Nan Li
Meng Ge
Zeyang Song
Haizhou Li
68
5
0
09 Mar 2024
CrossNet: Leveraging Global, Cross-Band, Narrow-Band, and Positional Encoding for Single- and Multi-Channel Speaker Separation
Vahid Ahmadi Kalkhorani
DeLiang Wang
74
3
0
06 Mar 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
59
0
0
04 Mar 2024
What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
Kuan-Hsun Ho
J. Hung
Berlin Chen
70
1
0
04 Mar 2024
Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet
Satvik Venkatesh
Arthur Benilov
Philip Coleman
Frederic Roskam
77
6
0
27 Feb 2024
SICRN: Advancing Speech Enhancement through State Space Model and Inplace Convolution Techniques
Changjiang Zhao
Shulin He
Xueliang Zhang
64
7
0
22 Feb 2024
Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN
Shiqi Zhang
Zheng Qiu
Daiki Takeuchi
Noboru Harada
Shoji Makino
39
4
0
13 Feb 2024
Sound Source Separation Using Latent Variational Block-Wise Disentanglement
Karim Helwani
M. Togami
Paris Smaragdis
Michael M. Goodwin
BDL
DRL
87
1
0
08 Feb 2024
Listen, Chat, and Remix: Text-Guided Soundscape Remixing for Enhanced Auditory Experience
Xilin Jiang
Cong Han
Yinghao Aaron Li
N. Mesgarani
KELM
91
5
0
06 Feb 2024
Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers
Marvin Tammen
Tsubasa Ochiai
Marc Delcroix
Tomohiro Nakatani
S. Araki
Simon Doclo
99
1
0
05 Feb 2024