CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages

27 March 2019

Papers citing "CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages"

23 / 23 papers shown

Title
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation Ji-Hoon Kim Hong-Sun Yang Yoon-Cheol Ju Il-Hwan Kim Byeong-Yeol Kim Joon Son Chung BDL 54 0 0 31 Dec 2024
Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation Yingting Li Ambuj Mehrish Bryan Chew Bo Cheng Soujanya Poria 40 0 0 25 Jun 2024
Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation K. Lakshminarayana C. Dittmar N. Pia Emanuel Habets 34 0 0 16 Jun 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining Takaaki Saeki Soumi Maiti Xinjian Li Shinji Watanabe Shinnosuke Takamichi Hiroshi Saruwatari 32 18 0 30 Jan 2023
German Phoneme Recognition with Text-to-Phoneme Data Augmentation Dojun Park Seohyun Park 16 0 0 24 Nov 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations Paul-Ambroise Duquenne Hongyu Gong Ning Dong Jingfei Du Ann Lee Vedanuj Goswani Changhan Wang J. Pino Benoît Sagot Holger Schwenk 45 34 0 08 Nov 2022
An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space Jihwan Lee Jaesung Bae Seongkyu Mun Heejin Choi Joun Yeop Lee Hoon-Young Cho Chanwoo Kim 32 2 0 06 Nov 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History Yuto Nishimura Yuki Saito Shinnosuke Takamichi Kentaro Tachibana Hiroshi Saruwatari AI4TS 27 7 0 16 Jun 2022
GWA: A Large High-Quality Acoustic Dataset for Audio Processing Zhenyu Tang R. Aralikatti Anton Ratnarajah Tianyi Zhou 35 31 0 04 Apr 2022
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features Florian Lux Ngoc Thang Vu 28 29 0 07 Mar 2022
Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments Mario Esparza 27 0 0 21 Feb 2022
Textless Speech-to-Speech Translation on Real Data Ann Lee Hongyu Gong Paul-Ambroise Duquenne Holger Schwenk Peng-Jen Chen ... Sravya Popuri Yossi Adi J. Pino Jiatao Gu Wei-Ning Hsu 31 143 0 15 Dec 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion Benjamin van Niekerk M. Carbonneau Julian Zaïdi Matthew Baas Hugo Seuté Herman Kamper DRL 27 111 0 03 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations Hyeong-Seok Choi Juheon Lee W. Kim Jie Hwan Lee Hoon Heo Kyogu Lee 37 151 0 27 Oct 2021
Assessing Evaluation Metrics for Speech-to-Speech Translation Elizabeth Salesky Julian Mäder Severin Klinger 35 14 0 26 Oct 2021
From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation Danni Liu Changhan Wang Hongyu Gong Xutai Ma Yun Tang J. Pino 25 4 0 15 Oct 2021
Scribosermo: Fast Speech-to-Text models for German and other Languages Daniel Bermuth Alexander Poeppel W. Reif 29 9 0 15 Oct 2021
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis Mu Yang Shaojin Ding Tianlong Chen Tong Wang Zhangyang Wang CLL 30 5 0 09 Oct 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 353 0 29 Jun 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 26 24 0 20 Apr 2021
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS Wen-Chin Huang Tomoki Hayashi Shinji Watanabe T. Toda DRL 15 39 0 06 Oct 2020
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis Yusuke Yasuda Xin Wang Junichi Yamagishi AI4TS 22 31 0 20 May 2020
Learning pronunciation from a foreign language in speech synthesis networks Younggun Lee Suwon Shon Taesu Kim 22 26 0 23 Nov 2018