ResearchTrend.AI
arXiv: 2404.14860
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance

23 April 2024
Tsubasa Ochiai
Kazuma Iwamoto
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
Shoko Araki
Shigeru Katagiri

Papers citing "Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance"

27 / 27 papers shown
Title
Neural Target Speech Extraction: An Overview
Kateřina Žmolíková
Marc Delcroix
Tsubasa Ochiai
K. Kinoshita
Jan "Honza" Černocký
Dong Yu
45
89
0
31 Jan 2023
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks
Darius Petermann
Gordon Wichern
Aswin Shanmugam Subramanian
Zhong-Qiu Wang
Jonathan Le Roux
37
10
0
14 Dec 2022
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
113
3,515
0
06 Dec 2022
Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition
Catalin Zorila
R. Doddipatla
34
11
0
09 May 2022
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
Xuankai Chang
Takashi Maekaku
Yuya Fujita
Shinji Watanabe
VLM
68
45
0
01 Apr 2022
How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR
Kazuma Iwamoto
Tsubasa Ochiai
Marc Delcroix
Rintaro Ikeshita
Hiroshi Sato
S. Araki
S. Katagiri
37
59
0
18 Jan 2022
Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Naoyuki Kamo
Takafumi Moriya
53
27
0
11 Jan 2022
Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks
Nils L. Westhausen
R. Huber
Hannah Baumgartner
Ragini Sinha
J. Rennies
B. Meyer
47
10
0
02 Nov 2021
Recent Advances in End-to-End Automatic Speech Recognition
Jinyu Li
VLM
86
364
0
02 Nov 2021
SNRi Target Training for Joint Speech Enhancement and Recognition
Yuma Koizumi
Shigeki Karita
A. Narayanan
S. Panchapagesan
M. Bacchiani
59
14
0
01 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
178
1,794
0
26 Oct 2021
Controlling the Remixing of Separated Dialogue with a Non-Intrusive Quality Estimate
Matteo Torcoli
Jouni Paulus
T. Kastner
C. Uhle
40
8
0
21 Jul 2021
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato
Tsubasa Ochiai
Marc Delcroix
K. Kinoshita
Takafumi Moriya
Naoyuki Kamo
45
23
0
02 Jun 2021
Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation
Christoph Boeddeker
Wangyou Zhang
Tomohiro Nakatani
K. Kinoshita
Tsubasa Ochiai
Marc Delcroix
Naoyuki Kamo
Y. Qian
Reinhold Haeb-Umbach
39
30
0
30 Nov 2020
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
Quan Wang
Ignacio López Moreno
Mert Saglam
K. Wilson
Alan Chiao
...
Yanzhang He
Wei Li
Jason W. Pelecanos
M. Nika
A. Gruenstein
VLM
49
85
0
09 Sep 2020
Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR
Thilo von Neumann
Christoph Boeddeker
Lukas Drude
K. Kinoshita
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
39
41
0
04 Jun 2020
Asteroid: the PyTorch-based audio source separation toolkit for researchers
Manuel Pariente
Samuele Cornell
Joris Cosentino
S. Sivasankaran
Efthymios Tzinis
...
Juan M. Martín-Donas
David Ditter
Ariel Frank
Antoine Deleforge
Emmanuel Vincent
56
155
0
08 May 2020
Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
K. Kinoshita
Tsubasa Ochiai
Marc Delcroix
Tomohiro Nakatani
35
97
0
09 Mar 2020
SDR - half-baked or well done?
Jonathan Le Roux
Scott Wisdom
Hakan Erdogan
J. Hershey
125
1,180
0
06 Nov 2018
gpuRIR: A Python Library for Room Impulse Response Simulation with GPU Acceleration
David Diaz-Guerra
A. Miguel
J. R. Beltrán
84
123
0
26 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
124
1,772
0
20 Sep 2018
Performance Based Cost Functions for End-to-End Speech Separation
Shrikant Venkataramani
Ryley Higa
Paris Smaragdis
26
21
0
01 Jun 2018
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
Jon Barker
Shinji Watanabe
Emmanuel Vincent
J. Trmal
43
680
0
28 Mar 2018
Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline
Szu-Jui Chen
Aswin Shanmugam Subramanian
Hainan Xu
Shinji Watanabe
13
76
0
27 Mar 2018
Supervised Speech Separation Based on Deep Learning: An Overview
DeLiang Wang
Jitong Chen
SSL
54
1,359
0
24 Aug 2017
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
880
149,474
0
22 Dec 2014
On the difficulty of training Recurrent Neural Networks
Razvan Pascanu
Tomas Mikolov
Yoshua Bengio
ODL
139
5,318
0
21 Nov 2012