ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1802.08435
  4. Cited By
Efficient Neural Audio Synthesis

Efficient Neural Audio Synthesis

23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
ArXivPDFHTML

Papers citing "Efficient Neural Audio Synthesis"

50 / 472 papers shown
Title
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern
  Architectures
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
Gabriel Synnaeve
Qiantong Xu
Jacob Kahn
Tatiana Likhomanenko
Edouard Grave
Vineel Pratap
Anuroop Sriram
Vitaliy Liptchinsky
R. Collobert
SSL
AI4TS
36
246
0
19 Nov 2019
Deep Long Audio Inpainting
Deep Long Audio Inpainting
Ya-Liang Chang
Kuan-Ying Lee
Po-Yu Wu
Hung-yi Lee
Winston H. Hsu
30
33
0
15 Nov 2019
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence
  Modelling
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling
Ruizhe Zhao
Brian K. Vogel
Tanvir Ahmed
Wayne Luk
25
37
0
14 Nov 2019
What Do Compressed Deep Neural Networks Forget?
What Do Compressed Deep Neural Networks Forget?
Sara Hooker
Aaron Courville
Gregory Clark
Yann N. Dauphin
Andrea Frome
19
181
0
13 Nov 2019
A unified sequence-to-sequence front-end model for Mandarin
  text-to-speech synthesis
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis
Junjie Pan
Xiang Yin
Zhiling Zhang
Shichao Liu
Yang Zhang
Zejun Ma
Yuxuan Wang
9
26
0
11 Nov 2019
Towards Fine-Grained Prosody Control for Voice Conversion
Towards Fine-Grained Prosody Control for Voice Conversion
Zheng Lian
Zhengqi Wen
6
18
0
24 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on
  Convolutional Neural Networks
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks
Kazuhiro Nakamura
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
19
19
0
24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source
  End-to-End Text-to-Speech Toolkit
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
T. Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
29
202
0
24 Oct 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech
  Synthesis
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
33
113
0
23 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform
  Synthesis
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
16
938
0
08 Oct 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
13
48
0
03 Oct 2019
Attention Forcing for Sequence-to-sequence Model Training
Attention Forcing for Sequence-to-sequence Model Training
Qingyun Dou
Yiting Lu
Joshua Efiong
Mark Gales
27
6
0
26 Sep 2019
Speech Recognition with Augmented Synthesized Speech
Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Ye Jia
Pedro J. Moreno
Yonghui Wu
Zelin Wu
32
127
0
25 Sep 2019
High Fidelity Speech Synthesis with Adversarial Networks
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
243
239
0
25 Sep 2019
Many-to-Many Voice Conversion using Cycle-Consistent Variational
  Autoencoder with Multiple Decoders
Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders
Keonnyeong Lee
In-Chul Yoo
Dongsuk Yook
DRL
4
14
0
15 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu
Heng Lu
Na Hu
Meng Yu
Chao Weng
...
Deyi Tuo
Shiyin Kang
Guangzhi Lei
Dan Su
Dong Yu
CVBM
17
118
0
04 Sep 2019
Maximizing Mutual Information for Tacotron
Maximizing Mutual Information for Tacotron
Peng Liu
Xixin Wu
Shiyin Kang
Guangzhi Li
Dan Su
Dong Yu
22
16
0
30 Aug 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech
  Synthesis and Cross-Language Voice Cloning
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Z. Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
14
188
0
09 Jul 2019
Speech bandwidth extension with WaveNet
Speech bandwidth extension with WaveNet
Archit Gupta
Brendan Shillingford
Yannis Assael
Thomas C. Walters
24
28
0
05 Jul 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic
  Speech -- a Deep Learning approach
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
16
10
0
05 Jul 2019
Improving Performance of End-to-End ASR on Numeric Sequences
Improving Performance of End-to-End ASR on Numeric Sequences
Cal Peyser
Hao Zhang
Tara N. Sainath
Zelin Wu
AI4TS
19
36
0
01 Jul 2019
The Difficulty of Training Sparse Neural Networks
The Difficulty of Training Sparse Neural Networks
Utku Evci
Fabian Pedregosa
Aidan Gomez
Erich Elsen
12
101
0
25 Jun 2019
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase
  Spectra for Statistical Parametric Speech Synthesis
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
Yang Ai
Zhenhua Ling
21
29
0
23 Jun 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for
  Acoustic Modeling
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling
Yuanhao Yi
Yang Ai
Zhenhua Ling
Lirong Dai
15
33
0
21 Jun 2019
Parametric Resynthesis with neural vocoders
Parametric Resynthesis with neural vocoders
Soumi Maiti
Michael I. Mandel
17
19
0
16 Jun 2019
A Signal Propagation Perspective for Pruning Neural Networks at
  Initialization
A Signal Propagation Perspective for Pruning Neural Networks at Initialization
Namhoon Lee
Thalaiyasingam Ajanthan
Stephen Gould
Philip Torr
AAML
22
152
0
14 Jun 2019
Non-Differentiable Supervised Learning with Evolution Strategies and
  Hybrid Methods
Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods
Karel Lenc
Erich Elsen
Tom Schaul
Karen Simonyan
15
20
0
07 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
27
131
0
04 Jun 2019
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio
  voice conversion
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion
Joan Serrà
Santiago Pascual
Carlos Segura
CVBM
20
84
0
03 Jun 2019
Rethinking Full Connectivity in Recurrent Neural Networks
Rethinking Full Connectivity in Recurrent Neural Networks
Matthijs Van Keirsbilck
A. Keller
Xiaodong Yang
LRM
14
13
0
29 May 2019
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
Scott H. Hawley
Benjamin Colburn
S. I. Mimilakis
14
12
0
28 May 2019
Non-Autoregressive Neural Text-to-Speech
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Ming-Yu Liu
Z. Song
Kexin Zhao
29
39
0
21 May 2019
Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with
  Hierarchical Latent Variables
Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables
F. Kingma
Pieter Abbeel
Jonathan Ho
11
96
0
16 May 2019
MoGlow: Probabilistic and controllable motion synthesis using
  normalising flows
MoGlow: Probabilistic and controllable motion synthesis using normalising flows
G. Henter
Simon Alexanderson
Jonas Beskow
39
97
0
16 May 2019
Improving Opus Low Bit Rate Quality with Neural Speech Synthesis
Improving Opus Low Bit Rate Quality with Neural Speech Synthesis
Jan Skoglund
J. Valin
39
38
0
12 May 2019
High quality, lightweight and adaptable TTS using LPCNet
High quality, lightweight and adaptable TTS using LPCNet
Zvi Kons
Slava Shechtman
A. Sorin
Carmel Rabinovitz
R. Hoory
17
54
0
02 May 2019
Deep Learning for Audio Signal Processing
Deep Learning for Audio Signal Processing
Hendrik Purwins
Bo-wen Li
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
24
586
0
30 Apr 2019
Neural source-filter waveform models for statistical parametric speech
  synthesis
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
40
117
0
27 Apr 2019
Singing voice synthesis based on convolutional neural networks
Singing voice synthesis based on convolutional neural networks
Kazuhiro Nakamura
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
26
33
0
15 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Z. Chen
Yonghui Wu
21
223
0
12 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
26
47
0
09 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Haohan Guo
Frank Soong
Lei He
Lei Xie
18
30
0
09 Apr 2019
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its
  Applications to Hearing-Impaired Speech and Speech Separation
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
D. Kanvesky
Ye Jia
21
112
0
08 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from
  Mel-spectrogram
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
13
18
0
08 Apr 2019
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform
  Generation
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
22
19
0
05 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Z. Chen
Yonghui Wu
11
910
0
05 Apr 2019
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet
J. Valin
Jan Skoglund
24
78
0
28 Mar 2019
The State of Sparsity in Deep Neural Networks
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
33
745
0
25 Feb 2019
Capacity allocation through neural network layers
Capacity allocation through neural network layers
Jonathan Donier
14
3
0
22 Feb 2019
Capacity allocation analysis of neural networks: A tool for principled
  architecture design
Capacity allocation analysis of neural networks: A tool for principled architecture design
Jonathan Donier
22
4
0
12 Feb 2019
Previous
123...1089
Next