Efficient Neural Audio Synthesis

23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu

Papers citing "Efficient Neural Audio Synthesis"

50 / 469 papers shown
WaveFake: A Data Set to Facilitate Audio Deepfake Detection
Joel Frank
Lea Schönherr
DiffM
204
131
0
04 Nov 2021
RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses
Shengyuan Xu
Wenxiao Zhao
Jing Guo
63
12
0
01 Nov 2021
TorchAudio: Building Blocks for Audio and Speech Processing
Yao-Yuan Yang
Moto Hira
Zhaoheng Ni
Anjali Chourdia
Artyom Astafurov
...
Mehrzad Samadi
Shinji Watanabe
Soumith Chintala
Vincent Quenneville-Bélair
Yangyang Shi
106
169
0
28 Oct 2021
Chunked Autoregressive GAN for Conditional Waveform Synthesis
Max Morrison
Rithesh Kumar
Kundan Kumar
Prem Seetharaman
Aaron Courville
Yoshua Bengio
GAN
130
72
0
19 Oct 2021
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation
Fengyu Yang
Jian Luan
Yujun Wang
137
5
0
19 Oct 2021
KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke
Xiaobin Zhuang
Huiran Yu
Weifeng Zhao
Tao Jiang
Peng Hu
90
6
0
18 Oct 2021
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis
Yongmao Zhang
Jian Cong
Heyang Xue
Lei Xie
Pengcheng Zhu
Mengxiao Bi
97
77
0
17 Oct 2021
PixelPyramids: Exact Inference Models from Lossless Image Pyramids
Shweta Mahajan
Stefan Roth
TPM
51
2
0
17 Oct 2021
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation
Rongjie Huang
Chenye Cui
Feiyang Chen
Yi Ren
Jinglin Liu
Zhou Zhao
Baoxing Huai
N. Yuan
GAN
203
63
0
14 Oct 2021
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data
Haitong Zhang
Yue Lin
56
0
0
14 Oct 2021
Revisiting IPA-based Cross-lingual Text-to-speech
Haitong Zhang
Haoyue Zhan
Yang Zhang
Xinyuan Yu
Yue Lin
61
7
0
14 Oct 2021
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding
Sergey Nikonorov
Berrak Sisman
Mingyang Zhang
Haizhou Li
41
3
0
13 Oct 2021
Denoising Diffusion Gamma Models
Eliya Nachmani
Robin San Roman
Lior Wolf
DiffM, VLM
81
32
0
10 Oct 2021
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis
Mu Yang
Shaojin Ding
Tianlong Chen
Tong Wang
Zhangyang Wang
CLL
73
5
0
09 Oct 2021
Using multiple reference audios and style embedding constraints for speech synthesis
Cheng Gong
Longbiao Wang
Zhenhua Ling
Ju Zhang
Jianwu Dang
48
5
0
09 Oct 2021
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
Pengfei Wu
Junjie Pan
Chenchang Xu
Junhui Zhang
Lin Wu
Xiang Yin
Zejun Ma
72
16
0
08 Oct 2021
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over
Junchen Lu
Berrak Sisman
Rui Liu
Mingyang Zhang
Haizhou Li
DiffM
91
20
0
07 Oct 2021
End-to-End Supermask Pruning: Learning to Prune Image Captioning Models
J. Tan
C. Chan
Joon Huang Chuah
VLM
132
16
0
07 Oct 2021
Emphasis control for parallel neural TTS
Shreyas Seshadri
T. Raitio
D. Castellani
Jiangchuan Li
120
11
0
06 Oct 2021
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS
T. Raitio
Jiangchuan Li
Shreyas Seshadri
78
23
0
06 Oct 2021
Autoregressive Diffusion Models
Emiel Hoogeboom
Alexey A. Gritsenko
Jasmijn Bastings
Ben Poole
Rianne van den Berg
Tim Salimans
DiffM
127
155
0
05 Oct 2021
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis
Cheng-I Jeff Lai
Erica Cooper
Yang Zhang
Shiyu Chang
Kaizhi Qian
...
Yung-Sung Chuang
Alexander H. Liu
Junichi Yamagishi
David D. Cox
James R. Glass
69
6
0
04 Oct 2021
Powerpropagation: A sparsity inducing weight reparameterisation
Jonathan Richard Schwarz
Siddhant M. Jayakumar
Razvan Pascanu
P. Latham
Yee Whye Teh
194
55
0
01 Oct 2021
On-device neural speech synthesis
Sivanand Achanta
Albert Antony
L. Golipour
Jiangchuan Li
T. Raitio
...
Francesco Rossi
Jennifer Shi
Jaimin Upadhyay
David Winarsky
Hepeng Zhang
108
17
0
17 Sep 2021
DDS: A new device-degraded speech dataset for speech enhancement
Haoyu Li
Junichi Yamagishi
92
9
0
16 Sep 2021
Bilateral Denoising Diffusion Models
Max W. Y. Lam
Jun Wang
Rongjie Huang
Jane Polak Scowcroft
Dong Yu
DiffM
83
43
0
26 Aug 2021
Combining speakers of multiple languages to improve quality of neural voices
Javier Latorre
Charlotte Bailleul
Tuuli H. Morrill
Alistair Conkie
Y. Stylianou
64
8
0
17 Aug 2021
A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate
Ahmed Mustafa
Jan Büthe
Srikanth Korse
Kishan Gupta
Guillaume Fuchs
N. Pia
131
19
0
09 Aug 2021
A Tandem Framework Balancing Privacy and Security for Voice User Interfaces
Ranya Aloufi
Hamed Haddadi
David E. Boyle
90
3
0
21 Jul 2021
Approximation Theory of Convolutional Architectures for Time Series Modelling
Haotian Jiang
Zhong Li
Qianxiao Li
AI4TS
83
12
0
20 Jul 2021
Translatotron 2: High-quality direct speech-to-speech translation with voice preservation
Ye Jia
Michelle Tadmor Ramanovich
Tal Remez
Roi Pomerantz
105
73
0
19 Jul 2021
Codified audio language modeling learns useful representations for music information retrieval
Rodrigo Castellon
Chris Donahue
Percy Liang
146
91
0
12 Jul 2021
SoundStream: An End-to-End Neural Audio Codec
Neil Zeghidour
Alejandro Luebs
Ahmed Omran
Jan Skoglund
Marco Tagliasacchi
AI4TS
120
806
0
07 Jul 2021
Adversarial Auto-Encoding for Packet Loss Concealment
Santiago Pascual
Joan Serrà
Jordi Pons
71
29
0
07 Jul 2021
A Generative Model for Raw Audio Using Transformer Architectures
Prateek Verma
C. Chafe
79
29
0
30 Jun 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
133
359
0
29 Jun 2021
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
Decebal Constantin Mocanu
OOD
129
53
0
28 Jun 2021
Basis-MelGAN: Efficient Neural Vocoder Based on Audio Decomposition
Zhengxi Liu
Y. Qian
DRL
49
10
0
25 Jun 2021
Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech
Raahil Shah
Kamil Pokora
Abdelhamid Ezzerg
V. Klimkov
Goeric Huybrechts
Bartosz Putrycz
Daniel Korzekwa
Thomas Merritt
64
26
0
24 Jun 2021
Distilling the Knowledge from Conditional Normalizing Flows
Dmitry Baranchuk
Vladimir Aliev
Artem Babenko
BDL
85
2
0
24 Jun 2021
Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis
Jian Cong
Shan Yang
Lei Xie
Jane Polak Scowcroft
DRL
110
29
0
21 Jun 2021
Controllable Context-aware Conversational Speech Synthesis
Jian Cong
Shan Yang
Na Hu
Guangzhi Li
Lei Xie
Jane Polak Scowcroft
73
30
0
21 Jun 2021
Enriching Source Style Transfer in Recognition-Synthesis based Non-Parallel Voice Conversion
Zhichao Wang
Xinyong Zhou
Fengyu Yang
Tao Li
Hongqiang Du
Lei Xie
Wendong Gan
Haitao Chen
Hai Li
65
22
0
16 Jun 2021
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows
Adam Gabryś
Yunlong Jiao
V. Klimkov
Daniel Korzekwa
Roberto Barra-Chicote
48
1
0
16 Jun 2021
Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis
D. Mohan
Qinmin Hu
Tian Huey Teh
Alexandra Torresquintero
C. Wallis
Marlene Staib
Lorenzo Foglianti
Jiameng Gao
Simon King
55
17
0
15 Jun 2021
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
Won Jang
D. Lim
Jaesam Yoon
Bongwan Kim
Juntae Kim
116
132
0
15 Jun 2021
Non Gaussian Denoising Diffusion Models
Eliya Nachmani
Robin San Roman
Lior Wolf
VLM, DiffM
83
50
0
14 Jun 2021
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
Sang-gil Lee
Heeseung Kim
Chaehun Shin
Xu Tan
Chang-Shu Liu
Qi Meng
Tao Qin
Wei Chen
Sung-Hoon Yoon
Tie-Yan Liu
DiffM
85
89
0
11 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
167
903
0
11 Jun 2021
Top-KAST: Top-K Always Sparse Training
Siddhant M. Jayakumar
Razvan Pascanu
Jack W. Rae
Simon Osindero
Erich Elsen
184
100
0
07 Jun 2021