Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

12 May 2020

Papers citing "Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis"

37 / 37 papers shown

Title
Muyan-TTS: A Trainable Text-to-Speech Model Optimized for Podcast Scenarios with a $50K Budget Xin Li Kaikai Jia Hao Sun Jun Dai Z. L. Jiang 131 0 0 27 Apr 2025
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning Shuai Wang Zheng-Shou Chen Kong Aik Lee Yan-min Qian Haizhou Li 39 4 0 21 Jul 2024
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment Paarth Neekhara Shehzeen Samarah Hussain Subhankar Ghosh Jason Chun Lok Li Rafael Valle Rohan Badlani Boris Ginsburg 55 11 0 25 Jun 2024
Cognitively Inspired Energy-Based World Models Alexi Gladstone Ganesh Nanduru Md. Mofijul Islam Aman Chadha Jundong Li Tariq Iqbal 39 0 0 13 Jun 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model Xiangyu Zhang Daijiao Liu Hexin Liu Qiquan Zhang Hanyu Meng Leibny Paola García Chng Eng Siong Lina Yao DiffM 25 2 0 16 Feb 2024
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech Abhinav Garg Jiyeon Kim Sushil Khyalia Chanwoo Kim Dhananjaya N. Gowda 25 2 0 19 Jan 2024
Creating New Voices using Normalizing Flows Piotr Bilinski Thomas Merritt Abdelhamid Ezzerg Kamil Pokora Sebastian Cygert K. Yanagisawa Roberto Barra-Chicote Daniel Korzekwa 20 17 0 22 Dec 2023
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes Seongho Joo Hyukhun Koh Kyomin Jung DiffM 47 0 0 23 Oct 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design Jungil Kong Jihoon Park Beomjeong Kim Jeongmin Kim Dohee Kong Sangjin Kim 37 35 0 31 Jul 2023
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus Yuma Koizumi Heiga Zen Shigeki Karita Yifan Ding Kohei Yatabe Nobuyuki Morioka M. Bacchiani Yu Zhang Wei Han Ankur Bapna 43 66 0 30 May 2023
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS Sewade Ogun Vincent Colotte Emmanuel Vincent DiffM 29 4 0 28 May 2023
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech Xin Jing Yi Chang Zijiang Yang Jiang-jian Xie Andreas Triantafyllopoulos Bjoern W. Schuller 41 10 0 22 May 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts Chengzhe Sun Shan Jia Shuwei Hou Siwei Lyu 32 38 0 25 Apr 2023
Affective social anthropomorphic intelligent system Md. Adyelullahil Mamun Hasnat Md. Abdullah Md. Golam Rabiul Alam Muhammad Mehedi Hassan Md. Zia Uddin 17 1 0 19 Apr 2023
Do Prosody Transfer Models Transfer Prosody? A. Sigurgeirsson Simon King DiffM 4 7 0 07 Mar 2023
An investigation into the adaptability of a diffusion-based TTS model Haolin Chen Philip N. Garner DiffM 36 1 0 03 Mar 2023
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow Yoonhyung Lee Jinhyeok Yang Kyomin Jung 22 6 0 27 Feb 2023
Exposing AI-Synthesized Human Voices Using Neural Vocoder Artifacts Chengzhe Sun Shan Jia Shuwei Hou Ehab AlBadawy Siwei Lyu 130 3 0 18 Feb 2023
OverFlow: Putting flows on top of neural transducers for better TTS Shivam Mehta Ambika Kirkland Harm Lameris Jonas Beskow Éva Székely G. Henter AI4TS 39 12 0 13 Nov 2022
Cover Reproducible Steganography via Deep Generative Models Kejiang Chen Hang Zhou Yaofei Wang Meng Li Weiming Zhang Neng H. Yu DiffM 28 9 0 26 Oct 2022
Detecting Synthetic Speech Manipulation in Real Audio Recordings M. Rahman M. Graciarena Diego Castán Chris Cobo-Kroenke Mitchell McLaren A. Lawson AAML 25 9 0 15 Sep 2022
The Role of Vocal Persona in Natural and Synthesized Speech Camille Noufi Lloyd May J. Berger 25 2 0 06 Sep 2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis Yinghao Aaron Li Cong Han N. Mesgarani 36 38 0 30 May 2022
Distribution augmentation for low-resource expressive text-to-speech Mateusz Lajszczak Animesh Prasad Arent van Korlaar Bajibabu Bollepalli A. Bonafonte ... M. Nicolis Alexis Moinet Thomas Drugman Trevor Wood Elena Sokolova 30 7 0 13 Feb 2022
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks E. Hortal Rodrigo Brechard Alarcia GAN 23 2 0 06 Oct 2021
Integrated Speech and Gesture Synthesis Siyang Wang Simon Alexanderson Joakim Gustafson Jonas Beskow G. Henter Éva Székely 37 19 0 25 Aug 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 352 0 29 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech Jaehyeon Kim Jungil Kong Juhee Son DRL 77 842 0 11 Jun 2021
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations? Thierry Deruyttere Victor Milewski Marie-Francine Moens 30 15 0 08 Jun 2021
Exploring emotional prototypes in a high dimensional TTS latent space Pol van Rijn Silvan Mertes Dominik Schiller Peter M. C. Harrison P. Larrouy-Maestri Elisabeth André Nori Jacoby 28 12 0 05 May 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 26 24 0 20 Apr 2021
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention Peng Liu Yuewen Cao Songxiang Liu Na Hu Guangzhi Li Chao Weng Dan Su 36 22 0 12 Feb 2021
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Ron J. Weiss RJ Skerry-Ryan Eric Battenberg Soroosh Mariooryad Diederik P. Kingma 21 97 0 06 Nov 2020
DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong Ming-Yu Liu Jiaji Huang Kexin Zhao Bryan Catanzaro DiffM BDL 34 1,392 0 21 Sep 2020
FastPitch: Parallel Text-to-speech with Pitch Prediction Adrian Łańcucki 27 332 0 11 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search Jaehyeon Kim Sungwon Kim Jungil Kong Sungroh Yoon 31 473 0 22 May 2020
High Fidelity Speech Synthesis with Adversarial Networks Mikolaj Binkowski Jeff Donahue Sander Dieleman Aidan Clark Erich Elsen Norman Casagrande Luis C. Cobo Karen Simonyan 232 239 0 25 Sep 2019