Efficient Neural Audio Synthesis

23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu

Papers citing "Efficient Neural Audio Synthesis"

50 / 472 papers shown
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
Qiu Ran
Yankai Lin
Peng Li
Jie Zhou
21
39
0
09 Jun 2020
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
17
185
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
54
477
0
22 May 2020
Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario
Zexin Cai
Yaogen Yang
Ming Li
11
9
0
21 May 2020
Conversational End-to-End TTS for Voice Agent
Haohan Guo
Shaofei Zhang
Frank Soong
Lei He
Lei Xie
26
67
0
21 May 2020
The Effectiveness of Discretization in Forecasting: An Empirical Study on Neural Time Series Models
Stephan Rabanser
Tim Januschowski
Valentin Flunkert
David Salinas
Jan Gasthaus
BDL
AI4TS
22
20
0
20 May 2020
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech
Wenjie Li
Benlai Tang
Xiang Yin
Yushi Zhao
Wei Li
Kang Wang
Hao Huang
Yuxuan Wang
Zejun Ma
14
13
0
19 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
T. Toda
29
4
0
18 May 2020
MoBoAligner: a Neural Alignment Model for Non-autoregressive TTS with Monotonic Boundary Search
Naihan Li
Shujie Liu
Yanqing Liu
Sheng Zhao
Ming Liu
Ming Zhou
6
6
0
18 May 2020
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding
Seungwoo Choi
Seungju Han
Dongyoung Kim
S. Ha
37
65
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
30
30
0
18 May 2020
Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction
Yi Zhao
Haoyu Li
Cheng-I Jeff Lai
Jennifer Williams
Erica Cooper
Junichi Yamagishi
39
18
0
16 May 2020
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU
Po-Chun Hsu
Hung-yi Lee
11
16
0
15 May 2020
Reverberation Modeling for Source-Filter-based Neural Vocoder
Yang Ai
Xin Wang
Junichi Yamagishi
Zhenhua Ling
20
3
0
15 May 2020
AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN
Zewang Zhang
Qiao Tian
Heng Lu
Ling-Hui Chen
Shan Liu
7
27
0
12 May 2020
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction
Qiao Tian
Zewang Zhang
Heng Lu
Linghui Chen
Shan Liu
16
22
0
12 May 2020
Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
Geng Yang
Shan Yang
Kai-Chun Liu
Peng Fang
Wei Chen
Lei Xie
66
198
0
11 May 2020
GACELA -- A generative adversarial context encoder for long audio inpainting
Andrés Marafioti
P. Majdak
Nicki Holighaus
Nathanael Perraudin
35
43
0
11 May 2020
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint
Zexin Cai
Chuxiong Zhang
Ming Li
24
41
0
10 May 2020
TIRAMISU: A Polyhedral Compiler for Dense and Sparse Deep Learning
Riyadh Baghdadi
Abdelkader Nadir Debbagh
K. Abdous
Fatima-Zohra Benhamida
Alex Renda
Jonathan Frankle
Michael Carbin
Saman P. Amarasinghe
13
16
0
07 May 2020
Jukebox: A Generative Model for Music
Prafulla Dhariwal
Heewoo Jun
Christine Payne
Jong Wook Kim
Alec Radford
Ilya Sutskever
VLM
52
722
0
30 Apr 2020
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech
S. Karlapati
Alexis Moinet
Arnaud Joly
V. Klimkov
Daniel Sáez-Trigueros
Thomas Drugman
11
67
0
30 Apr 2020
ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders
Yu Gu
Xiang Yin
Yonghui Rao
Yuan Wan
Benlai Tang
Yang Zhang
Jitong Chen
Yuxuan Wang
Zejun Ma
20
70
0
23 Apr 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders
Yang Ai
Zhenhua Ling
13
8
0
16 Apr 2020
Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data
Soumi Maiti
Erik Marchi
Alistair Conkie
11
17
0
10 Apr 2020
Normalizing Flows with Multi-Scale Autoregressive Priors
Shweta Mahajan
Apratim Bhattacharyya
Mario Fritz
Bernt Schiele
Stefan Roth
BDL
DRL
12
16
0
08 Apr 2020
Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset
Lee F. Callender
Curtis Hawthorne
Jesse Engel
43
20
0
01 Apr 2020
Speech Quality Factors for Traditional and Neural-Based Low Bit Rate Vocoders
Wissam A. Jassim
Jan Skoglund
Michael Chinen
Andrew Hines
14
8
0
26 Mar 2020
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
191
1,032
0
06 Mar 2020
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment
Zhen Zeng
Jianzong Wang
Ning Cheng
Tian Xia
Jing Xiao
VLM
33
56
0
04 Mar 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
22
148
0
26 Feb 2020
Lifter Training and Sub-band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials
Takaaki Saeki
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
7
4
0
17 Feb 2020
Speech-to-Singing Conversion in an Encoder-Decoder Framework
Jayneel Parekh
Preeti Rao
Yi-Hsuan Yang
25
11
0
16 Feb 2020
Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks
Shindong Lee
Bonggu Ko
Keonnyeong Lee
In-Chul Yoo
Dongsuk Yook
GAN
30
33
0
15 Feb 2020
Real-time speech enhancement using equilibriated RNN
Daiki Takeuchi
Kohei Yatabe
Yuma Koizumi
Yasuhiro Oikawa
N. Harada
20
34
0
14 Feb 2020
Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization
Kai Zhen
Mi Suk Lee
Jongmo Sung
Seungkwon Beack
Minje Kim
35
20
0
13 Feb 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior
Guangzhi Sun
Yu Zhang
Ron J. Weiss
Yuan Cao
Heiga Zen
Andrew Rosenberg
Bhuvana Ramabhadran
Yonghui Wu
DiffM
36
92
0
06 Feb 2020
Vocoder-free End-to-End Voice Conversion with Transformer Network
June-Woo Kim
H. Jung
Minho Lee
30
4
0
05 Feb 2020
Scaling Up Online Speech Recognition Using ConvNets
Vineel Pratap
Qiantong Xu
Jacob Kahn
Gilad Avidov
Tatiana Likhomanenko
Awni Y. Hannun
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
154
38
0
27 Jan 2020
SqueezeWave: Extremely Lightweight Vocoders for On-device Speech Synthesis
Bohan Zhai
Tianren Gao
Flora Xue
D. Rothchild
Bichen Wu
Joseph E. Gonzalez
Kurt Keutzer
21
27
0
16 Jan 2020
DDSP: Differentiable Digital Signal Processing
Jesse Engel
Lamtharn Hantrakul
Chenjie Gu
Adam Roberts
DiffM
96
373
0
14 Jan 2020
Synthesising Expressiveness in Peking Opera via Duration Informed Attention Network
Yusong Wu
Shengchen Li
Chengzhu Yu
Heng Lu
Chao Weng
Liqiang Zhang
Dong Yu
18
5
0
27 Dec 2019
Score and Lyrics-Free Singing Voice Generation
Jen-Yu Liu
Yu-Hua Chen
Yin-Cheng Yeh
Yi-Hsuan Yang
24
22
0
26 Dec 2019
Learning Singing From Speech
Liqiang Zhang
Chengzhu Yu
Heng Lu
Chao Weng
Yusong Wu
Xiang Xie
Zijin Li
Dong Yu
15
8
0
20 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
36
242
0
06 Dec 2019
Towards Robust Neural Vocoding for Speech Generation: A Survey
Po-Chun Hsu
Chun-hsuan Wang
Andy T. Liu
Hung-yi Lee
OOD
17
24
0
05 Dec 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
Wei Ping
Kainan Peng
Kexin Zhao
Z. Song
20
116
0
03 Dec 2019
Rigging the Lottery: Making All Tickets Winners
Utku Evci
Trevor Gale
Jacob Menick
Pablo Samuel Castro
Erich Elsen
16
588
0
25 Nov 2019
Fast Sparse ConvNets
Erich Elsen
Marat Dukhan
Trevor Gale
Karen Simonyan
21
151
0
21 Nov 2019
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features
Francesco Ferroni
Kilol Gupta
D. Shah
Z. Shakeri
Jervis Pinto
15
15
0
21 Nov 2019