ResearchTrend.AI
Efficient Neural Audio Synthesis


23 February 2018
Nal Kalchbrenner
Erich Elsen
Karen Simonyan
Seb Noury
Norman Casagrande
Edward Lockhart
Florian Stimberg
Aaron van den Oord
Sander Dieleman
Koray Kavukcuoglu
arXiv: 1802.08435

Papers citing "Efficient Neural Audio Synthesis"

50 / 472 papers shown
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
Ji-Hoon Kim
Sang-Hoon Lee
Ji-Hyun Lee
Seong-Whan Lee
24
53
0
04 Jun 2021
NVC-Net: End-to-End Adversarial Voice Conversion
Bac Nguyen Cong
Fabien Cardinaux
AAML
42
41
0
02 Jun 2021
1xN Pattern for Pruning Convolutional Neural Networks
Mingbao Lin
Yu-xin Zhang
Yuchao Li
Bohong Chen
Fei Chao
Mengdi Wang
Shen Li
Yonghong Tian
Rongrong Ji
3DPC
33
40
0
31 May 2021
DiffSVC: A Diffusion Probabilistic Model for Singing Voice Conversion
Songxiang Liu
Yuewen Cao
Dan Su
Helen Meng
DiffM
32
57
0
28 May 2021
High-Fidelity and Low-Latency Universal Neural Vocoder based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling
Patrick Lumban Tobing
T. Toda
49
8
0
20 May 2021
Dual-side Sparse Tensor Core
Yang-Feng Wang
Chen Zhang
Zhiqiang Xie
Cong Guo
Yunxin Liu
Jingwen Leng
25
75
0
20 May 2021
MASS: Multi-task Anthropomorphic Speech Synthesis Framework
Jinyin Chen
Linhui Ye
Zhaoyan Ming
23
6
0
10 May 2021
Protecting gender and identity with disentangled speech representations
Dimitrios Stoidis
Andrea Cavallaro
30
10
0
22 Apr 2021
Compact CNN Structure Learning by Knowledge Distillation
Waqar Ahmed
Andrea Zunino
Pietro Morerio
Vittorio Murino
38
5
0
19 Apr 2021
Accelerating Sparse Deep Neural Networks
Asit K. Mishra
J. Latorre
Jeff Pool
Darko Stosic
Dusan Stosic
Ganesh Venkatesh
Chong Yu
Paulius Micikevicius
22
222
0
16 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN
Reo Yoneyama
Yi-Chiao Wu
T. Toda
14
12
0
10 Apr 2021
Noise Estimation for Generative Diffusion Models
Robin San-Roman
Eliya Nachmani
Lior Wolf
DiffM
41
105
0
06 Apr 2021
NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
Junhyeok Lee
Seungu Han
DiffM
29
67
0
06 Apr 2021
Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling
Qing He
Zhiping Xiu
T. Koehler
Jilong Wu
10
7
0
01 Apr 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
25
27
0
31 Mar 2021
Training Sparse Neural Network by Constraining Synaptic Weight on Unit Lp Sphere
Weipeng Li
Xiaogang Yang
Chuanxiang Li
Ruitao Lu
Xueli Xie
16
0
0
30 Mar 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Ye Jia
Heiga Zen
Jonathan Shen
Yu Zhang
Yonghui Wu
SSL
50
81
0
28 Mar 2021
Scalable and Efficient Neural Speech Coding: A Hybrid Design
Kai Zhen
Jongmo Sung
Mi Suk Lee
Seung-Wha Beack
Minje Kim
26
13
0
27 Mar 2021
Continual Speaker Adaptation for Text-to-Speech Synthesis
Hamed Hemati
Damian Borth
CLL
24
9
0
26 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN
Cong Wang
Yu Chen
Bin Wang
Yi Shi
35
1
0
26 Mar 2021
Latent Space Explorations of Singing Voice Synthesis using DDSP
J. Alonso
Cumhur Erkut
46
12
0
12 Mar 2021
GAN Vocoder: Multi-Resolution Discriminator Is All You Need
J. You
Dalhyun Kim
Gyuhyeon Nam
Geumbyeol Hwang
Gyeongsu Chae
21
27
0
09 Mar 2021
Generating Images with Sparse Representations
C. Nash
Jacob Menick
Sander Dieleman
Peter W. Battaglia
33
201
0
05 Mar 2021
Compute and memory efficient universal sound source separation
Efthymios Tzinis
Zhepei Wang
Xilin Jiang
Paris Smaragdis
26
40
0
03 Mar 2021
Handling Background Noise in Neural Speech Generation
Tom Denton
Alejandro Luebs
Felicia S. C. Lim
Andrew Storus
Hengchin Yeh
W. Kleijn
Jan Skoglund
13
2
0
23 Feb 2021
Generative Speech Coding with Predictive Variance Regularization
W. Kleijn
Andrew Storus
Michael Chinen
Tom Denton
Felicia S. C. Lim
Alejandro Luebs
Jan Skoglund
Hengchin Yeh
29
67
0
18 Feb 2021
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components
Yukiya Hono
Shinji Takaki
Kei Hashimoto
Keiichiro Oura
Yoshihiko Nankaku
K. Tokuda
22
16
0
15 Feb 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention
Peng Liu
Yuewen Cao
Songxiang Liu
Na Hu
Guangzhi Li
Chao Weng
Dan Su
42
22
0
12 Feb 2021
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning
Giuseppe Ruggiero
Enrico Zovato
Luigi Di Caro
V. Pollet
DiffM
21
9
0
10 Feb 2021
Universal Neural Vocoding with Parallel WaveNet
Yunlong Jiao
Adam Gabryś
Georgi Tinchev
Bartosz Putrycz
Daniel Korzekwa
V. Klimkov
36
42
0
01 Feb 2021
Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet
Shilu Lin
Fenglong Xie
Li Meng
Xinhui Li
Li Lu
11
0
0
30 Jan 2021
Whispered and Lombard Neural Speech Synthesis
Qiong Hu
T. Bleisch
Petko N. Petkov
T. Raitio
Erik Marchi
V. Lakshminarasimhan
4
14
0
13 Jan 2021
Parallel WaveNet conditioned on VAE latent vectors
Jonas Rohnke
Thomas Merritt
Jaime Lorenzo-Trueba
Adam Gabryś
Vatsal Aggarwal
Alexis Moinet
Roberto Barra-Chicote
28
3
0
17 Dec 2020
DeepTalk: Vocal Style Encoding for Speaker Recognition and Speech Synthesis
Anurag Chowdhury
Arun Ross
Prabu David
16
5
0
09 Dec 2020
I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch
Joseph P. Turian
Max Henry
24
29
0
08 Dec 2020
Text-to-speech for the hearing impaired
Josef Schlittenlacher
T. Baer
14
0
0
03 Dec 2020
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training
Haohan Guo
Heng Lu
Na Hu
Chunlei Zhang
Shan Yang
Lei Xie
Dan Su
Dong Yu
AAML
27
12
0
03 Dec 2020
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution
Zhen Zeng
Jianzong Wang
Ning Cheng
Jing Xiao
14
8
0
03 Dec 2020
FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge
Bichen Wu
Qing He
Peizhao Zhang
T. Koehler
Kurt Keutzer
Peter Vajda
31
6
0
25 Nov 2020
Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech
Yiling Huang
Yutian Chen
Jason W. Pelecanos
Quan Wang
33
11
0
24 Nov 2020
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder
Sam Davis
Giuseppe Coccia
Sam Gooch
Julian Mack
14
0
0
20 Nov 2020
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains
Won Jang
D. Lim
Jaesam Yoon
22
31
0
19 Nov 2020
Towards transformation-resilient provenance detection of digital media
Jamie Hayes
Krishnamurthy Dvijotham
Yutian Chen
Sander Dieleman
Pushmeet Kohli
Norman Casagrande
18
3
0
14 Nov 2020
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
Shulei Ji
Jing Luo
Xinyu Yang
MGen
13
125
0
13 Nov 2020
Low-resource expressive text-to-speech using data augmentation
Goeric Huybrechts
Thomas Merritt
Giulia Comini
Bartek Perz
Raahil Shah
Jaime Lorenzo-Trueba
26
50
0
11 Nov 2020
Enhancing Low-Quality Voice Recordings Using Disentangled Channel Factor and Neural Waveform Model
Haoyu Li
Yang Ai
Junichi Yamagishi
17
2
0
10 Nov 2020
Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis
Erica Cooper
Xin Wang
Yi Zhao
Yusuke Yasuda
Junichi Yamagishi
SyDa
14
3
0
10 Nov 2020
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis
Ron J. Weiss
RJ Skerry-Ryan
Eric Battenberg
Soroosh Mariooryad
Diederik P. Kingma
24
98
0
06 Nov 2020
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis
Guanghui Xu
Wei Song
Zhengchen Zhang
Chao Zhang
Xiaodong He
Bowen Zhou
13
50
0
06 Nov 2020
Paralinguistic Privacy Protection at the Edge
Ranya Aloufi
Hamed Haddadi
David E. Boyle
17
14
0
04 Nov 2020