Diversity-based core-set selection for text-to-speech with linguistic
and acoustic features

Diversity-based core-set selection for text-to-speech with linguistic and acoustic features

15 September 2023

Shinnosuke Takamichi

Hiroshi Saruwatari

ArXiv (abs)PDF HTML

Papers citing "Diversity-based core-set selection for text-to-speech with linguistic and acoustic features"

15 / 15 papers shown

Title
Robust Speech Recognition via Large-Scale Weak Supervision Alec Radford Jong Wook Kim Tao Xu Greg Brockman C. McLeavey Ilya Sutskever OffRL 201 3,732 0 06 Dec 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection Kentaro Seki Shinnosuke Takamichi Takaaki Saeki Hiroshi Saruwatari 80 7 0 26 Oct 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022 Takaaki Saeki Detai Xin Wataru Nakata Tomoki Koriyama Shinnosuke Takamichi Hiroshi Saruwatari 109 217 0 05 Apr 2022
The VoiceMOS Challenge 2022 Wen-Chin Huang Erica Cooper Yu Tsao Hsin-Min Wang Tomoki Toda Junichi Yamagishi 65 108 0 21 Mar 2022
JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification Shinnosuke Takamichi Ludwig Kurzinger Takaaki Saeki Sayaka Shiota Shinji Watanabe 39 25 0 17 Dec 2021
GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio Guoguo Chen Shuzhou Chai Guan-Bo Wang Jiayu Du Weiqiang Zhang ... Xuchen Yao Yongqing Wang Yujun Wang Zhao You Zhiyong Yan 116 383 0 13 Jun 2021
Controllable Emotion Transfer For End-to-End Speech Synthesis Tao Li Shan Yang Liumeng Xue Lei Xie 67 74 0 17 Nov 2020
AISHELL-3: A Multi-speaker Mandarin TTS Corpus and the Baselines Yao Shi Hui Bu Xin Xu Shaojing Zhang Ming Li 83 223 0 22 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis Jungil Kong Jaehyeon Kim Jaekyoung Bae 179 1,947 0 12 Oct 2020
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations Alexei Baevski Henry Zhou Abdel-rahman Mohamed Michael Auli SSL 297 5,837 0 20 Jun 2020
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Yi Ren Chenxu Hu Xu Tan Tao Qin Sheng Zhao Zhou Zhao Tie-Yan Liu 105 1,406 0 08 Jun 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks Nils Reimers Iryna Gurevych 1.3K 12,301 0 27 Aug 2019
JVS corpus: free Japanese multi-speaker voice corpus Shinnosuke Takamichi Kentaro Mitsui Yuki Saito Tomoki Koriyama Naoko Tanji Hiroshi Saruwatari 51 72 0 17 Aug 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech Heiga Zen Viet Dang R. Clark Yu Zhang Ron J. Weiss Ye Jia Zhiwen Chen Yonghui Wu 104 959 0 05 Apr 2019
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions Jonathan Shen Ruoming Pang Ron J. Weiss M. Schuster Navdeep Jaitly ... Yuxuan Wang RJ Skerry-Ryan Rif A. Saurous Yannis Agiomyrgiannakis Yonghui Wu 85 2,703 0 16 Dec 2017