arXiv: 1802.08435

Efficient Neural Audio Synthesis

23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu

Papers citing "Efficient Neural Audio Synthesis"

50 / 469 papers shown
Deep Long Audio Inpainting
Ya-Liang Chang
Kuan-Ying Lee
Po-Yu Wu
Hung-yi Lee
Winston H. Hsu
68
33
0
15 Nov 2019
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling
Ruizhe Zhao
Brian K. Vogel
Tanvir Ahmed
Wayne Luk
61
37
0
14 Nov 2019
What Do Compressed Deep Neural Networks Forget?
Sara Hooker
Aaron Courville
Gregory Clark
Yann N. Dauphin
Andrea Frome
118
185
0
13 Nov 2019
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis
Junjie Pan
Xiang Yin
Zhiling Zhang
Shichao Liu
Yang Zhang
Zejun Ma
Yuxuan Wang
47
27
0
11 Nov 2019
Towards Fine-Grained Prosody Control for Voice Conversion
Zheng Lian
Zhengqi Wen
70
19
0
24 Oct 2019
Fast and High-Quality Singing Voice Synthesis System based on Convolutional Neural Networks
Kazuhiro Nakamura
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
84
19
0
24 Oct 2019
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Tomoki Hayashi
Ryuichi Yamamoto
Katsuki Inoue
Takenori Yoshimura
Shinji Watanabe
Tomoki Toda
K. Takeda
Yu Zhang
Xu Tan
VLM
93
205
0
24 Oct 2019
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
Eric Battenberg
RJ Skerry-Ryan
Soroosh Mariooryad
Daisy Stanton
David Kao
Matt Shannon
Tom Bagby
106
114
0
23 Oct 2019
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
Kundan Kumar
Rithesh Kumar
T. Boissière
L. Gestin
Wei Zhen Teoh
Jose M. R. Sotelo
A. D. Brébisson
Yoshua Bengio
Aaron Courville
GAN
178
962
0
08 Oct 2019
Semi-Supervised Generative Modeling for Controllable Speech Synthesis
Raza Habib
Soroosh Mariooryad
Matt Shannon
Eric Battenberg
RJ Skerry-Ryan
Daisy Stanton
David Kao
Tom Bagby
BDL
68
48
0
03 Oct 2019
Attention Forcing for Sequence-to-sequence Model Training
Qingyun Dou
Yiting Lu
Joshua Efiong
Mark Gales
62
6
0
26 Sep 2019
Speech Recognition with Augmented Synthesized Speech
Andrew Rosenberg
Yu Zhang
Bhuvana Ramabhadran
Ye Jia
Pedro J. Moreno
Yonghui Wu
Zelin Wu
69
128
0
25 Sep 2019
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
316
240
0
25 Sep 2019
Many-to-Many Voice Conversion using Cycle-Consistent Variational Autoencoder with Multiple Decoders
Keonnyeong Lee
In-Chul Yoo
Dongsuk Yook
DRL
77
14
0
15 Sep 2019
DurIAN: Duration Informed Attention Network For Multimodal Synthesis
Chengzhu Yu
Heng Lu
Na Hu
Meng Yu
Chao Weng
...
Deyi Tuo
Shiyin Kang
Guangzhi Lei
Jane Polak Scowcroft
Dong Yu
CVBM
92
118
0
04 Sep 2019
Maximizing Mutual Information for Tacotron
Peng Liu
Xixin Wu
Shiyin Kang
Guangzhi Li
Jane Polak Scowcroft
Dong Yu
86
16
0
30 Aug 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
Yu Zhang
Ron J. Weiss
Heiga Zen
Yonghui Wu
Zhiwen Chen
RJ Skerry-Ryan
Ye Jia
Andrew Rosenberg
Bhuvana Ramabhadran
76
189
0
09 Jul 2019
Speech bandwidth extension with WaveNet
Archit Gupta
Brendan Shillingford
Yannis Assael
Thomas C. Walters
60
29
0
05 Jul 2019
A Methodology for Controlling the Emotional Expressiveness in Synthetic Speech -- a Deep Learning approach
Noé Tits
40
10
0
05 Jul 2019
Improving Performance of End-to-End ASR on Numeric Sequences
Cal Peyser
Hao Zhang
Tara N. Sainath
Zelin Wu
AI4TS
63
36
0
01 Jul 2019
The Difficulty of Training Sparse Neural Networks
Utku Evci
Fabian Pedregosa
Aidan Gomez
Erich Elsen
81
101
0
25 Jun 2019
A Neural Vocoder with Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis
Yang Ai
Zhenhua Ling
123
29
0
23 Jun 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling
Yuanhao Yi
Yang Ai
Zhenhua Ling
Lirong Dai
56
33
0
21 Jun 2019
Parametric Resynthesis with neural vocoders
Soumi Maiti
Michael I. Mandel
68
19
0
16 Jun 2019
A Signal Propagation Perspective for Pruning Neural Networks at Initialization
Namhoon Lee
Thalaiyasingam Ajanthan
Stephen Gould
Philip Torr
AAML
78
156
0
14 Jun 2019
Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods
Karel Lenc
Erich Elsen
Tom Schaul
Karen Simonyan
49
20
0
07 Jun 2019
MelNet: A Generative Model for Audio in the Frequency Domain
Sean Vasquez
M. Lewis
DiffM
85
132
0
04 Jun 2019
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion
Joan Serrà
Santiago Pascual
Carlos Segura
CVBM
76
85
0
03 Jun 2019
Rethinking Full Connectivity in Recurrent Neural Networks
Matthijs Van Keirsbilck
A. Keller
Xiaodong Yang
LRM
41
14
0
29 May 2019
SignalTrain: Profiling Audio Compressors with Deep Neural Networks
Scott H. Hawley
Benjamin Colburn
S. I. Mimilakis
42
12
0
28 May 2019
Non-Autoregressive Neural Text-to-Speech
Kainan Peng
Wei Ping
Z. Song
Kexin Zhao
101
40
0
21 May 2019
Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables
F. Kingma
Pieter Abbeel
Jonathan Ho
106
98
0
16 May 2019
MoGlow: Probabilistic and controllable motion synthesis using normalising flows
G. Henter
Simon Alexanderson
Jonas Beskow
94
98
0
16 May 2019
Improving Opus Low Bit Rate Quality with Neural Speech Synthesis
Jan Skoglund
J. Valin
88
38
0
12 May 2019
High quality, lightweight and adaptable TTS using LPCNet
Zvi Kons
Slava Shechtman
A. Sorin
Carmel Rabinovitz
R. Hoory
69
54
0
02 May 2019
Deep Learning for Audio Signal Processing
Hendrik Purwins
Yue Liu
Tuomas Virtanen
Jan Schlüter
Shuo-yiin Chang
Tara N. Sainath
VLM
119
599
0
30 Apr 2019
Neural source-filter waveform models for statistical parametric speech synthesis
Xin Wang
Shinji Takaki
Junichi Yamagishi
97
118
0
27 Apr 2019
Singing voice synthesis based on convolutional neural networks
Kazuhiro Nakamura
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
86
33
0
15 Apr 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhiwen Chen
Yonghui Wu
103
230
0
12 Apr 2019
A New GAN-based End-to-End TTS Training Algorithm
Haohan Guo
Frank Soong
Lei He
Lei Xie
101
47
0
09 Apr 2019
Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS
Haohan Guo
Frank Soong
Lei He
Lei Xie
74
30
0
09 Apr 2019
Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation
Fadi Biadsy
Ron J. Weiss
Pedro J. Moreno
D. Kanevsky
Ye Jia
97
115
0
08 Apr 2019
GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram
Lauri Juvela
Bajibabu Bollepalli
Junichi Yamagishi
P. Alku
76
18
0
08 Apr 2019
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation
Kou Tanaka
Hirokazu Kameoka
Takuhiro Kaneko
Nobukatsu Hojo
80
19
0
05 Apr 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
164
959
0
05 Apr 2019
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet
J. Valin
Jan Skoglund
62
79
0
28 Mar 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
193
765
0
25 Feb 2019
Capacity allocation through neural network layers
Jonathan Donier
48
3
0
22 Feb 2019
Capacity allocation analysis of neural networks: A tool for principled architecture design
Jonathan Donier
49
4
0
12 Feb 2019
Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design
Jonathan Ho
Xi Chen
A. Srinivas
Yan Duan
Pieter Abbeel
DRL
105
451
0
01 Feb 2019