ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2305.10763
  4. Cited By
CLAPSpeech: Learning Prosody from Text Context with Contrastive
  Language-Audio Pre-training

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

18 May 2023
Zhe Ye
Rongjie Huang
Yi Ren
Ziyue Jiang
Jinglin Liu
Jinzheng He
Xiang Yin
Zhou Zhao
    CLIP
ArXiv (abs)PDFHTML

Papers citing "CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training"

24 / 24 papers shown
Title
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis
Rui Liu
Zhenqi Jia
F. Bao
Hong Li
73
2
0
11 Jan 2025
ProDiff: Progressive Fast Diffusion Model For High-Quality
  Text-to-Speech
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech
Rongjie Huang
Zhou Zhao
Huadai Liu
Jinglin Liu
Chenye Cui
Yi Ren
DiffM
87
200
0
13 Jul 2022
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
SyntaSpeech: Syntax-Aware Generative Adversarial Text-to-Speech
Zhenhui Ye
Zhou Zhao
Yi Ren
Leilei Gan
64
28
0
25 Apr 2022
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech
  Synthesis
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
Rongjie Huang
Max W. Y. Lam
Jun Wang
Dan Su
Dong Yu
Yi Ren
Zhou Zhao
DiffM
66
170
0
21 Apr 2022
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech
  Synthesis and Editing
A3^33T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Richard He Bai
Renjie Zheng
Junkun Chen
Xintong Li
Mingbo Ma
Liang Huang
77
53
0
18 Mar 2022
ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in
  Text-to-Speech
ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
Yi Ren
Ming Lei
Zhiying Huang
Shi-Rui Zhang
Qian Chen
Zhijie Yan
Zhou Zhao
78
43
0
16 Feb 2022
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech
  Recognition
WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition
Binbin Zhang
Hang Lv
Pengcheng Guo
Qijie Shao
Chao Yang
...
Hui Bu
Xiaoyu Chen
Chenchen Zeng
Di Wu
Zhendong Peng
64
231
0
07 Oct 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
111
359
0
29 Jun 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked
  Prediction of Hidden Units
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
180
2,989
0
14 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for
  End-to-End Text-to-Speech
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
128
894
0
11 Jun 2021
Speech BERT Embedding For Improving Prosody in Neural TTS
Speech BERT Embedding For Improving Prosody in Neural TTS
Liping Chen
Yan Deng
Xi Wang
Frank Soong
Lei He
67
23
0
08 Jun 2021
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Jinglin Liu
Chengxi Li
Yi Ren
Feiyang Chen
Zhou Zhao
DiffM
119
267
0
06 May 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
Ye Jia
Heiga Zen
Jonathan Shen
Yu Zhang
Yonghui Wu
SSL
85
84
0
28 Mar 2021
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture
EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture
Chenfeng Miao
Shuang Liang
Zhencheng Liu
Minchuan Chen
Jun Ma
Shaojun Wang
Jing Xiao
62
38
0
07 Dec 2020
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech
  Synthesis
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis
Rui Liu
Berrak Sisman
Haizhou Li
82
25
0
23 Oct 2020
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Parallel Tacotron: Non-Autoregressive and Controllable TTS
Isaac Elias
Heiga Zen
Jonathan Shen
Yu Zhang
Ye Jia
Ron J. Weiss
Yonghui Wu
DRL
68
103
0
22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High
  Fidelity Speech Synthesis
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
179
1,944
0
12 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech
  Representations
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
285
5,801
0
20 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
Yi Ren
Chenxu Hu
Xu Tan
Tao Qin
Sheng Zhao
Zhou Zhao
Tie-Yan Liu
105
1,401
0
08 Jun 2020
End-to-End Adversarial Text-to-Speech
End-to-End Adversarial Text-to-Speech
Jeff Donahue
Sander Dieleman
Mikolaj Binkowski
Erich Elsen
Karen Simonyan
70
186
0
05 Jun 2020
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment
  Search
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim
Sungwon Kim
Jungil Kong
Sungroh Yoon
100
495
0
22 May 2020
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
104
956
0
05 Apr 2019
Semi-Supervised Training for Improving Data Efficiency in End-to-End
  Speech Synthesis
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis
Yu-An Chung
Yuxuan Wang
Wei-Ning Hsu
Yu Zhang
RJ Skerry-Ryan
79
117
0
30 Aug 2018
Transfer Learning from Speaker Verification to Multispeaker
  Text-To-Speech Synthesis
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Zhiwen Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
256
834
0
12 Jun 2018
1