ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.14324
  4. Cited By
Towards General-Purpose Text-Instruction-Guided Voice Conversion
v1v2 (latest)

Towards General-Purpose Text-Instruction-Guided Voice Conversion

25 September 2023
Chun-Yi Kuan
Chen-An Li
Tsung-Yuan Hsu
Tzu-Quan Lin
Ho-Lam Chung
Kai-Wei Chang
Shuo-yiin Chang
Hung-yi Lee
ArXiv (abs)PDFHTML

Papers citing "Towards General-Purpose Text-Instruction-Guided Voice Conversion"

30 / 30 papers shown
Title
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis,
  and Translation
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation
Tianrui Wang
Long Zhou
Zi-Hua Zhang
Yu-Huan Wu
Shujie Liu
Yashesh Gaur
Zhuo Chen
Jinyu Li
Furu Wei
81
105
0
25 May 2023
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with
  Natural Language Style Prompt
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt
Dongchao Yang
Songxiang Liu
Rongjie Huang
Chao Weng
Helen Meng
DiffMVLM
82
99
0
31 Jan 2023
AudioGen: Textually Guided Audio Generation
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
99
309
0
30 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
161
615
0
07 Sep 2022
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken
  Language Model for Speech Processing Tasks
SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang
Wei-Cheng Tseng
Shang-Wen Li
Hung-yi Lee
78
23
0
31 Mar 2022
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised
  Learning
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning
Qiqi Wang
Xulong Zhang
Jianzong Wang
Ning Cheng
Jing Xiao
DRL
103
23
0
22 Feb 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice
  Conversion for everyone
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
232
414
0
04 Dec 2021
A Comparison of Discrete and Soft Speech Units for Improved Voice
  Conversion
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
Benjamin van Niekerk
M. Carbonneau
Julian Zaïdi
Matthew Baas
Hugo Seuté
Herman Kamper
DRL
90
122
0
03 Nov 2021
Neural Analysis and Synthesis: Reconstructing Speech from
  Self-Supervised Representations
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
90
158
0
27 Oct 2021
Zero-shot Voice Conversion via Self-supervised Prosody Representation
  Learning
Zero-shot Voice Conversion via Self-supervised Prosody Representation Learning
Shijun Wang
Dimche Kostadinov
Damian Borth
74
11
0
27 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised
  Speech Representations
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Hung-yi Lee
Shinji Watanabe
Tomoki Toda
71
40
0
12 Oct 2021
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for
  Natural-Sounding Voice Conversion
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Yinghao Aaron Li
A. Zare
N. Mesgarani
84
101
0
21 Jul 2021
Emotional Voice Conversion: Theory, Databases and ESD
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
113
179
0
31 May 2021
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised
  Pretrained Representations
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations
Jheng-hao Lin
Yist Y. Lin
C. Chien
Hung-yi Lee
124
56
0
07 Apr 2021
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and
  Fusing Fine-Grained Voice Fragments With Attention
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention
Yist Y. Lin
C. Chien
Jheng-hao Lin
Hung-yi Lee
Lin-Shan Lee
51
79
0
27 Oct 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised
  Discrete Speech Representations
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
Tomoki Toda
BDL
91
38
0
23 Oct 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram
  Conversion
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
74
82
0
22 Oct 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and
  cross-lingual voice conversion
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
83
210
0
28 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical
  Modeling to Deep Learning
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
111
326
0
09 Aug 2020
F0-consistent many-to-many non-parallel voice conversion via conditional
  autoencoder
F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder
Kaizhi Qian
Zeyu Jin
M. Hasegawa-Johnson
G. J. Mysore
65
107
0
15 Apr 2020
Singing Voice Conversion with Disentangled Representations of Singer and
  Vocal Technique Using Variational Autoencoders
Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders
Yin-Jyun Luo
Chin-Chen Hsu
Kat R. Agres
Dorien Herremans
DRL
46
47
0
03 Dec 2019
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice
  Conversion
StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
74
143
0
29 Jul 2019
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
84
467
0
14 May 2019
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
65
261
0
09 Apr 2019
Nonparallel Emotional Speech Conversion
Nonparallel Emotional Speech Conversion
Jian Gao
Deep Chakraborty
H. Tembine
Olaitan Olaleye
56
69
0
03 Nov 2018
Voice Conversion Based on Cross-Domain Features Using Variational Auto
  Encoders
Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders
Wen-Chin Huang
Hsin-Te Hwang
Yu-Huai Peng
Yu Tsao
H. Wang
64
43
0
29 Aug 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel
  and Nonparallel Methods
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods
Jaime Lorenzo-Trueba
Junichi Yamagishi
Tomoki Toda
Daisuke Saito
F. Villavicencio
Tomi Kinnunen
Zhenhua Ling
61
321
0
12 Apr 2018
High-quality nonparallel voice conversion based on cycle-consistent
  adversarial network
High-quality nonparallel voice conversion based on cycle-consistent adversarial network
Fuming Fang
Junichi Yamagishi
Isao Echizen
Jaime Lorenzo-Trueba
GAN
52
136
0
02 Apr 2018
Voice Conversion from Unaligned Corpora using Variational Autoencoding
  Wasserstein Generative Adversarial Networks
Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
DRL
85
314
0
04 Apr 2017
Voice Conversion from Non-parallel Corpora Using Variational
  Auto-encoder
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder
Chin-Cheng Hsu
Hsin-Te Hwang
Yi-Chiao Wu
Yu Tsao
H. Wang
89
304
0
13 Oct 2016
1