arXiv:2002.05843

Real-time speech enhancement using equilibriated RNN

14 February 2020
Daiki Takeuchi
Kohei Yatabe
Yuma Koizumi
Yasuhiro Oikawa
N. Harada
Abstract

We propose a speech enhancement method using a causal deep neural network (DNN) for real-time applications. DNNs have been widely used to estimate a time-frequency (T-F) mask that enhances a speech signal. One popular DNN structure for this task is the recurrent neural network (RNN), owing to its ability to model time-sequential data such as speech effectively. In particular, long short-term memory (LSTM) is often used to alleviate the vanishing/exploding gradient problem, which makes training an RNN difficult. However, LSTM mitigates this difficulty at the cost of a larger number of parameters, which requires more computational resources. For real-time speech enhancement, a smaller network that does not sacrifice performance is preferable. In this paper, we propose using the equilibriated recurrent neural network (ERNN) to avoid the vanishing/exploding gradient problem without increasing the number of parameters. The proposed structure is causal, i.e., it requires only information from the past, so that it can be applied in real time. Compared with uni- and bi-directional LSTM networks, the proposed method achieved similar performance with far fewer parameters.
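
The parameter overhead of LSTM mentioned in the abstract can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes a textbook LSTM layer (four weight blocks, one bias vector per block, no peepholes or projection) and compares it with a plain recurrent layer of the same width. The layer sizes are hypothetical, and the abstract does not state the exact parameter count of the ERNN, so none is computed here.

```python
def vanilla_rnn_params(n_in: int, n_hidden: int) -> int:
    """Parameters of one plain recurrent layer: h_t = f(W x_t + U h_{t-1} + b)."""
    input_weights = n_hidden * n_in          # W
    recurrent_weights = n_hidden * n_hidden  # U
    bias = n_hidden                          # b
    return input_weights + recurrent_weights + bias


def lstm_params(n_in: int, n_hidden: int) -> int:
    """Textbook LSTM layer: input, forget, and output gates plus the cell
    candidate each carry their own W, U, and b (assumed: no peepholes)."""
    return 4 * vanilla_rnn_params(n_in, n_hidden)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    n_in, n_hidden = 257, 256
    rnn = vanilla_rnn_params(n_in, n_hidden)
    lstm = lstm_params(n_in, n_hidden)
    print(f"plain RNN layer: {rnn} parameters")
    print(f"LSTM layer:      {lstm} parameters ({lstm // rnn}x)")
```

Under these assumptions an LSTM layer carries four times the parameters of a plain recurrent layer of equal width, which is the kind of cost the abstract argues against for real-time use. Framework implementations may differ slightly, for example by storing two bias vectors per gate.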
