FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion


27 October 2022
Jingyi Li
Weiping Tu
Li Xiao

Papers citing "FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion"

32 / 32 papers shown
Predicting Turn-Taking and Backchannel in Human-Machine Conversations Using Linguistic, Acoustic, and Visual Signals
Yuxin Lin
Yinglin Zheng
Ming Zeng
Wangzheng Shi
71
0
0
19 May 2025
Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
Na Li
Chuke Wang
Yu Gu
Zhifeng Li
114
0
0
11 Apr 2025
SpecWav-Attack: Leveraging Spectrogram Resizing and Wav2Vec 2.0 for Attacking Anonymized Speech
Yuqi Li
Yuanzhong Zheng
Zhongtian Guo
Yaoxuan Wang
Jianjun Yin
Haojun Fei
47
1
0
10 Jan 2025
Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS
T. Nguyen
Seymanur Akti
Ngoc-Quan Pham
A. Waibel
70
1
0
19 Oct 2024
SF-Speech: Straightened Flow for Zero-Shot Voice Clone
Xuyuan Li
Zengqiang Shang
Hua Hua
Peiyang Shi
Chen Yang
Li Wang
Pengyuan Zhang
120
2
0
16 Oct 2024
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
Chunyu Qiang
Wang Geng
Yi Zhao
Ruibo Fu
Tao Wang
...
Chen Zhang
Hao Che
L. Wang
Jianwu Dang
J. Tao
AI4TS
77
0
0
11 Aug 2024
PITCH: AI-assisted Tagging of Deepfake Audio Calls using Challenge-Response
Govind Mittal
Arthur Jakobsson
Kelly O. Marshall
Chinmay Hegde
Nasir Memon
75
0
0
28 Feb 2024
GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion
Magdalena Proszewska
Grzegorz Beringer
Daniel Sáez-Trigueros
Thomas Merritt
Abdelhamid Ezzerg
Roberto Barra-Chicote
46
6
0
04 Jul 2022
SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion Without tuning autoencoder Bottlenecks
Chak Ho Chan
Kaizhi Qian
Yang Zhang
M. Hasegawa-Johnson
DRL
25
48
0
26 Mar 2022
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
217
408
0
04 Dec 2021
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
74
158
0
27 Oct 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
239
1,857
0
26 Oct 2021
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
Hung-yi Lee
Shinji Watanabe
Tomoki Toda
61
40
0
12 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
44
128
0
12 Oct 2021
VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion
Disong Wang
Liqun Deng
Y. Yeung
Xiao Chen
Xunying Liu
Helen Meng
DRL
45
139
0
18 Jun 2021
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Jaehyeon Kim
Jungil Kong
Juhee Son
DRL
114
882
0
11 Jun 2021
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization
Yen-Hao Chen
Da-Yi Wu
Tsung-Han Wu
Hung-yi Lee
71
108
0
31 Oct 2020
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
66
191
0
28 Oct 2020
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Jungil Kong
Jaehyeon Kim
Jaekyoung Bae
177
1,931
0
12 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
38
53
0
30 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
69
90
0
06 Sep 2020
Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
Yi Zhao
Wen-Chin Huang
Xiaohai Tian
Junichi Yamagishi
Rohan Kumar Das
Tomi Kinnunen
Zhenhua Ling
Tomoki Toda
76
207
0
28 Aug 2020
Unsupervised Cross-lingual Representation Learning for Speech Recognition
Alexis Conneau
Alexei Baevski
R. Collobert
Abdel-rahman Mohamed
Michael Auli
SSL
144
780
0
24 Jun 2020
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture
Da-Yi Wu
Yen-Hao Chen
Hung-yi Lee
54
101
0
07 Jun 2020
Cotatron: Transcription-Guided Speech Encoder for Any-to-Many Voice Conversion without Parallel Data
Seung-won Park
Doo-young Kim
Myun-chul Joe
63
42
0
07 May 2020
AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss
Kaizhi Qian
Yang Zhang
Shiyu Chang
Xuesong Yang
M. Hasegawa-Johnson
78
465
0
14 May 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
Heiga Zen
Viet Dang
R. Clark
Yu Zhang
Ron J. Weiss
Ye Jia
Zhiwen Chen
Yonghui Wu
102
951
0
05 Apr 2019
Collapsed speech segment detection and suppression for WaveNet vocoder
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
Tomoki Toda
35
25
0
30 Apr 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang
Daisy Stanton
Yu Zhang
RJ Skerry-Ryan
Eric Battenberg
Joel Shor
Y. Xiao
Fei Ren
Ye Jia
Rif A. Saurous
64
826
0
23 Mar 2018
Least Squares Generative Adversarial Networks
Xudong Mao
Qing Li
Haoran Xie
Raymond Y. K. Lau
Zhen Wang
Stephen Paul Smolley
GAN
329
4,573
0
13 Nov 2016
Faster Eigenvector Computation via Shift-and-Invert Preconditioning
Dan Garber
Laurent Dinh
Chi Jin
Jascha Narain Sohl-Dickstein
Samy Bengio
Praneeth Netrapalli
Aaron Sidford
263
78
0
26 May 2016
Autoencoding beyond pixels using a learned similarity metric
Anders Boesen Lindbo Larsen
Søren Kaae Sønderby
Hugo Larochelle
Ole Winther
GAN
168
2,068
0
31 Dec 2015