Towards General-Purpose Text-Instruction-Guided Voice Conversion

25 September 2023

Hung-yi Lee

Papers citing "Towards General-Purpose Text-Instruction-Guided Voice Conversion"

Title | Authors | Tags | Counts | Date
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation | Tianrui Wang, Long Zhou, Zi-Hua Zhang, Yu-Huan Wu, Shujie Liu, Yashesh Gaur, Zhuo Chen, Jinyu Li, Furu Wei | | 81 105 0 | 25 May 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt | Dongchao Yang, Songxiang Liu, Rongjie Huang, Chao Weng, Helen Meng | DiffM, VLM | 82 99 0 | 31 Jan 2023
AudioGen: Textually Guided Audio Generation | Felix Kreuk, Gabriel Synnaeve, Adam Polyak, Uriel Singer, Alexandre Défossez, Jade Copet, Devi Parikh, Yaniv Taigman, Yossi Adi | DiffM | 99 309 0 | 30 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation | Zalan Borsos, Raphaël Marinier, Damien Vincent, Eugene Kharitonov, Olivier Pietquin, ..., Dominik Roblek, O. Teboul, David Grangier, Marco Tagliasacchi, Neil Zeghidour | AuLLM | 161 615 0 | 07 Sep 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee | | 78 23 0 | 31 Mar 2022
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning | Qiqi Wang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao | DRL | 103 23 0 | 22 Feb 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone | Edresson Casanova, Julian Weber, C. Shulby, Arnaldo Cândido Júnior, Eren Golge, M. Ponti | | 232 414 0 | 04 Dec 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion | Benjamin van Niekerk, M. Carbonneau, Julian Zaïdi, Matthew Baas, Hugo Seuté, Herman Kamper | DRL | 90 122 0 | 03 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations | Hyeong-Seok Choi, Juheon Lee, W. Kim, Jie Hwan Lee, Hoon Heo, Kyogu Lee | | 90 158 0 | 27 Oct 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning | Shijun Wang, Dimche Kostadinov, Damian Borth | | 74 11 0 | 27 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations | Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-yi Lee, Shinji Watanabe, Tomoki Toda | | 71 40 0 | 12 Oct 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion | Yinghao Aaron Li, A. Zare, N. Mesgarani | | 84 101 0 | 21 Jul 2021
Emotional Voice Conversion: Theory, Databases and ESD | Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li | | 113 179 0 | 31 May 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations | Jheng-hao Lin, Yist Y. Lin, C. Chien, Hung-yi Lee | | 124 56 0 | 07 Apr 2021
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention | Yist Y. Lin, C. Chien, Jheng-hao Lin, Hung-yi Lee, Lin-Shan Lee | | 51 79 0 | 27 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations | Wen-Chin Huang, Yi-Chiao Wu, Tomoki Hayashi, Tomoki Toda | BDL | 91 38 0 | 23 Oct 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion | Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo | | 74 82 0 | 22 Oct 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion | Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhenhua Ling, Tomoki Toda | | 83 210 0 | 28 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning | Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li | BDL | 111 326 0 | 09 Aug 2020
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder | Kaizhi Qian, Zeyu Jin, M. Hasegawa-Johnson, G. J. Mysore | | 65 107 0 | 15 Apr 2020
Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders | Yin-Jyun Luo, Chin-Chen Hsu, Kat R. Agres, Dorien Herremans | DRL | 46 47 0 | 03 Dec 2019
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion | Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo | | 74 143 0 | 29 Jul 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss | Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, M. Hasegawa-Johnson | | 84 467 0 | 14 May 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion | Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo | | 65 261 0 | 09 Apr 2019
Nonparallel Emotional Speech Conversion | Jian Gao, Deep Chakraborty, H. Tembine, Olaitan Olaleye | | 56 69 0 | 03 Nov 2018
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders | Wen-Chin Huang, Hsin-Te Hwang, Yu-Huai Peng, Yu Tsao, H. Wang | | 64 43 0 | 29 Aug 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods | Jaime Lorenzo-Trueba, Junichi Yamagishi, Tomoki Toda, Daisuke Saito, F. Villavicencio, Tomi Kinnunen, Zhenhua Ling | | 61 321 0 | 12 Apr 2018
High-quality nonparallel voice conversion based on cycle-consistent adversarial network | Fuming Fang, Junichi Yamagishi, Isao Echizen, Jaime Lorenzo-Trueba | GAN | 52 136 0 | 02 Apr 2018
Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks | Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, H. Wang | DRL | 85 314 0 | 04 Apr 2017
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder | Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, H. Wang | | 89 304 0 | 13 Oct 2016