ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2202.13066
  4. Cited By
Revisiting Over-Smoothness in Text to Speech

Revisiting Over-Smoothness in Text to Speech

26 February 2022
Yi Ren
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
ArXivPDFHTML

Papers citing "Revisiting Over-Smoothness in Text to Speech"

10 / 10 papers shown
Title
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
Improving Robustness of Diffusion-Based Zero-Shot Speech Synthesis via Stable Formant Generation
C. Han
Seokgi Lee
Gyuhyeon Nam
Gyeongsu Chae
DiffM
135
0
0
14 Sep 2024
Fake it to make it: Using synthetic data to remedy the data shortage in
  joint multimodal speech-and-gesture synthesis
Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis
Shivam Mehta
Anna Deichler
Jim O'Regan
Birger Moëll
Jonas Beskow
G. Henter
Simon Alexanderson
46
4
0
30 Apr 2024
StreamVoice: Streamable Context-Aware Language Modeling for Real-time
  Zero-Shot Voice Conversion
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Zhichao Wang
Yuan-Jui Chen
Xinsheng Wang
Lei Xie
Yuping Wang
22
6
0
19 Jan 2024
Stochastic Pitch Prediction Improves the Diversity and Naturalness of
  Speech in Glow-TTS
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS
Sewade Ogun
Vincent Colotte
Emmanuel Vincent
DiffM
27
4
0
28 May 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
30
4
0
08 May 2023
Towards Building Text-To-Speech Systems for the Next Billion Users
Towards Building Text-To-Speech Systems for the Next Billion Users
Gokul Karthik Kumar
V. PraveenS.
Pratyush Kumar
Mitesh M. Khapra
Karthik Nandakumar
38
18
0
17 Nov 2022
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS
Kun Song
Jian Cong
Xinsheng Wang
Yongmao Zhang
Linfu Xie
Ning Jiang
Haiying Wu
22
0
0
31 Oct 2022
ProDiff: Progressive Fast Diffusion Model For High-Quality
  Text-to-Speech
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
44
193
0
13 Jul 2022
High Fidelity Speech Synthesis with Adversarial Networks
High Fidelity Speech Synthesis with Adversarial Networks
Mikolaj Binkowski
Jeff Donahue
Sander Dieleman
Aidan Clark
Erich Elsen
Norman Casagrande
Luis C. Cobo
Karen Simonyan
223
239
0
25 Sep 2019
Image-to-Image Translation with Conditional Adversarial Networks
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola
Jun-Yan Zhu
Tinghui Zhou
Alexei A. Efros
SSeg
212
19,450
0
21 Nov 2016
1