VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment

12 June 2024

Shujie Liu

Yanming Qian

Sheng Zhao

Jinyu Li

Papers citing "VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment"

3 / 3 papers shown

Title
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching Y. Chen Zhikang Niu Ziyang Ma Keqi Deng Chunhui Wang Jian Zhao Kai Yu Xie Chen 33 52 0 09 Oct 2024
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech Jaehyeon Kim Keon Lee Seungjun Chung Jaewoong Cho 74 39 0 03 Apr 2024
Zero-Shot Text-to-Image Generation Aditya A. Ramesh Mikhail Pavlov Gabriel Goh Scott Gray Chelsea Voss Alec Radford Mark Chen Ilya Sutskever VLM 255 4,781 0 24 Feb 2021