Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

20 September 2018
Yi Luo, N. Mesgarani

Papers citing "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

Showing 50 of 773 citing papers.

Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation
William Ravenscroft, Stefan Goetze, Thomas Hain
13 Apr 2022

Listen only to me! How well can target speech extraction handle false alarms?
Marc Delcroix, K. Kinoshita, Tsubasa Ochiai, Kateřina Žmolíková, Hiroshi Sato, Tomohiro Nakatani
11 Apr 2022

SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning
Marc Delcroix, Jorge Bennasar Vázquez, Tsubasa Ochiai, K. Kinoshita, Yasunori Ohishi, S. Araki
VLM
08 Apr 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser
Sonal Joshi, Saurabh Kataria, Yiwen Shao, Piotr Żelasko, Jesus Villalba, Sanjeev Khudanpur, Najim Dehak
AAML
08 Apr 2022

AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification
Sonal Joshi, Saurabh Kataria, Jesus Villalba, Najim Dehak
AAML
08 Apr 2022

Audio-visual multi-channel speech separation, dereverberation and recognition
Guinan Li, Jianwei Yu, Jiajun Deng, Xunying Liu, Helen Meng
05 Apr 2022

Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches
Zifeng Zhao, Dongchao Yang, Rongzhi Gu, Haoran Zhang, Yuexian Zou
04 Apr 2022

tPLCnet: Real-time Deep Packet Loss Concealment in the Time Domain Using a Short Temporal Context
Nils L. Westhausen, B. Meyer
04 Apr 2022

Improving Target Sound Extraction with Timestamp Information
Helin Wang, Dongchao Yang, Chao Weng, Jianwei Yu, Yuexian Zou
02 Apr 2022

Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation
Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang
02 Apr 2022

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe
VLM
01 Apr 2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers
Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong-mei Xu
31 Mar 2022

A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan, Meng Ge, Haizhou Li
31 Mar 2022

Speaker Extraction with Co-Speech Gestures Cue
Zexu Pan, Xinyuan Qian, Haizhou Li
SLR
31 Mar 2022

A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
Fan Yu, Zhihao Du, Shiliang Zhang, Yuxiao Lin, Linfu Xie
31 Mar 2022

Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification
Saurabh Kataria, Jesús Villalba, Laureano Moro-Velazquez, Najim Dehak
30 Mar 2022

Phase-Aware Deep Speech Enhancement: It's All About The Frame Length
Tal Peer, Timo Gerkmann
30 Mar 2022

Coarse-to-Fine Recursive Speech Separation for Unknown Number of Speakers
Zhenhao Jin, Xiang Hao, Xiangdong Su
30 Mar 2022

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks
Fan Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
30 Mar 2022

Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement
Guochen Yu, Andong Li, Wenzhe Liu, C. Zheng, Yutian Wang, Haibo Wang
30 Mar 2022

DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning
Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto
29 Mar 2022

Separate What You Describe: Language-Queried Audio Source Separation
Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang
28 Mar 2022

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation
Xue Yang, C. Bao
25 Mar 2022

SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling
Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura, Naoko Tanji, Hiroshi Saruwatari
24 Mar 2022

FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement
Jun Chen, Zehao Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng
23 Mar 2022

Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement
Haoyu Li, Yun Liu, Junichi Yamagishi
22 Mar 2022

RoSS: Utilizing Robotic Rotation for Audio Source Separation
Hyungjoo Seo, Sahil Bhandary Karnoor, Romit Roy Choudhury
18 Mar 2022

A Squeeze-and-Excitation and Transformer based Cross-task System for Environmental Sound Recognition
Jisheng Bai, Jianfeng Chen, Mou Wang, Muhammad Saad Ayub
16 Mar 2022

MDNet: Learning Monaural Speech Enhancement from Deep Prior Gradient
Andong Li, C. Zheng, Ziyang Zhang, Xiaodong Li
14 Mar 2022

Improving the transferability of speech separation by meta-learning
Kuan-Po Huang, Yuan-Kuei Wu, Hung-yi Lee
11 Mar 2022

Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems
Rahil Parikh, Ilya Kavalerov, C. Espy-Wilson, Shihab Shamma (Institute for Systems Research)
08 Mar 2022

Single microphone speaker extraction using unified time-frequency Siamese-Unet
Aviad Eisenberg, Sharon Gannot, Shlomo E. Chazan
06 Mar 2022

Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement
Hu Fang, Tal Peer, S. Wermter, Timo Gerkmann
04 Mar 2022

Look&Listen: Multi-Modal Correlation Learning for Active Speaker Detection and Speech Enhancement
Jun Xiong, Yu Zhou, Peng Zhang, Lei Xie, Wei Huang, Yufei Zha
04 Mar 2022

DMF-Net: A decoupling-style multi-band fusion model for full-band speech enhancement
Guochen Yu, Yuansheng Guan, Weixin Meng, C. Zheng, Haibo Wang
01 Mar 2022

Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge
Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe
24 Feb 2022

Benchmarking Generative Latent Variable Models for Speech
Jakob Drachmann Havtorn, Lasse Borgholt, Søren Hauberg, J. Frellsen, Lars Maaløe
22 Feb 2022

L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment
E. Guizzo, Christian Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, Bruno Masiero, A. Uncini, Danilo Comminiello
21 Feb 2022

L-SpEx: Localized Target Speaker Extraction
Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li
21 Feb 2022

Multi-Channel Speech Denoising for Machine Ears
Cong Han, Emine Merve Kaya, Kyle Hoefer, M. Slaney, S. Carlile
17 Feb 2022

On loss functions and evaluation metrics for music source separation
Enric Gusó, Jordi Pons, Santiago Pascual, Joan Serrà
16 Feb 2022

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement
Guochen Yu, Andong Li, Hui Wang, Yutian Wang, Yuxuan Ke, C. Zheng
16 Feb 2022

Speech Denoising in the Waveform Domain with Self-Attention
Zhifeng Kong, Ming-Yu Liu, Ambrish Dantrey, Bryan Catanzaro
15 Feb 2022

Conditional Diffusion Probabilistic Model for Speech Enhancement
Yen-Ju Lu, Zhongqiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao
DiffM
10 Feb 2022

Royalflush Speaker Diarization System for ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge
Jingguang Tian, Xinhui Hu, Xinkang Xu
10 Feb 2022

MixCycle: Unsupervised Speech Separation via Cyclic Mixture Permutation Invariant Training
Ertuğ Karamatlı, S. Kırbız
SSL
08 Feb 2022

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge
Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, ..., Kong Aik Lee, Zhijie Yan, B. Ma, Xin Xu, Hui Bu
08 Feb 2022

Exploring Self-Attention Mechanisms for Speech Separation
Cem Subakan, Mirco Ravanelli, Samuele Cornell, François Grondin, Mirko Bronzi
06 Feb 2022

The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge
Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng
04 Feb 2022

New Insights on Target Speaker Extraction
Mohamed Elminshawi, Wolfgang Mack, Srikanth Raj Chetupalli, Soumitro Chakrabarty, Emanuel Habets
01 Feb 2022