2110.14513
Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations
27 October 2021
Hyeong-Seok Choi
Juheon Lee
W. Kim
Jie Hwan Lee
Hoon Heo
Kyogu Lee
Papers citing "Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations"
50 / 101 papers shown
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
J. Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
35
43
0
29 Aug 2023
HierVST: Hierarchical Adaptive Zero-shot Voice Style Transfer
Sang-Hoon Lee
Haram Choi
H. Oh
Seong-Whan Lee
BDL
23
9
0
30 Jul 2023
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
Yinghao Aaron Li
Cong Han
N. Mesgarani
20
5
0
18 Jul 2023
What Do Self-Supervised Speech Models Know About Words?
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
33
26
0
30 Jun 2023
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech
Yahuan Cong
Haoyu Zhang
Hao-Ping Lin
Shichao Liu
Chunfeng Wang
Yi Ren
Xiang Yin
Zejun Ma
25
1
0
27 Jun 2023
Toward Leveraging Pre-Trained Self-Supervised Frontends for Automatic Singing Voice Understanding Tasks: Three Case Studies
Yuya Yamamoto
25
2
0
22 Jun 2023
LM-VC: Zero-shot Voice Conversion via Speech Generation based on Language Models
Zhichao Wang
Yuan-Jui Chen
Linfu Xie
Qiao Tian
Yuping Wang
72
30
0
18 Jun 2023
HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models
Ji-Sang Hwang
Sang-Hoon Lee
Seong-Whan Lee
DiffM
25
8
0
12 Jun 2023
What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model
Mu Yang
R. Shekar
Okim Kang
John H. L. Hansen
15
10
0
10 Jun 2023
The Age of Synthetic Realities: Challenges and Opportunities
J. P. Cardenuto
Jing Yang
Rafael Padilha
Renjie Wan
Daniel Moreira
Haoliang Li
Shiqi Wang
Fernanda A. Andaló
Sébastien Marcel
Anderson de Rezende Rocha
DeLMO
42
29
0
09 Jun 2023
Make-A-Voice: Unified Voice Synthesis With Discrete Representation
Rongjie Huang
Chunlei Zhang
Yongqiang Wang
Dongchao Yang
Lu Liu
Zhenhui Ye
Ziyue Jiang
Chao Weng
Zhou Zhao
Dong Yu
DiffM
29
26
0
30 May 2023
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion
Haram Choi
Sang-Hoon Lee
Seong-Whan Lee
DiffM
13
26
0
25 May 2023
Wav2SQL: Direct Generalizable Speech-To-SQL Parsing
Huadai Liu
Rongjie Huang
Jinzheng He
Gang Sun
Ran Shen
Xize Cheng
Zhou Zhao
31
3
0
21 May 2023
Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clustering
Heng-Jui Chang
Alexander H. Liu
James R. Glass
SSL
17
20
0
18 May 2023
Adversarial Speaker Disentanglement Using Unannotated External Data for Self-supervised Representation Based Voice Conversion
Xintao Zhao
Shuai Wang
Yang Chao
Zhiyong Wu
H. Meng
27
3
0
16 May 2023
Multi-level Temporal-channel Speaker Retrieval for Zero-shot Voice Conversion
Zhichao Wang
Liumeng Xue
Qiuqiang Kong
Linfu Xie
Yuan-Jui Chen
Qiao Tian
Yuping Wang
BDL
9
3
0
12 May 2023
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Ruiqi Li
Rongjie Huang
Lichao Zhang
Jinglin Liu
Zhou Zhao
25
4
0
08 May 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI
Chenshuang Zhang
Chaoning Zhang
Sheng Zheng
Mengchun Zhang
Maryam Qamar
Sung-Ho Bae
In So Kweon
DiffM
MedIm
43
64
0
23 Mar 2023
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS
Junhyeok Lee
Wonbin Jung
Hyunjae Cho
Jaeyeon Kim
Jaehwan Kim
17
3
0
24 Feb 2023
ACE-VC: Adaptive and Controllable Voice Conversion using Explicitly Disentangled Self-supervised Speech Representations
Shehzeen Samarah Hussain
Paarth Neekhara
Jocelyn Huang
Jason Chun Lok Li
Boris Ginsburg
13
21
0
16 Feb 2023
Speaker-Independent Acoustic-to-Articulatory Speech Inversion
Peter Wu
Li-Wei Chen
Cheol Jun Cho
Shinji Watanabe
L. Goldstein
A. Black
Gopala K. Anumanchipalli
16
25
0
14 Feb 2023
Disentangling Prosody Representations with Unsupervised Speech Reconstruction
Leyuan Qu
Taiha Li
C. Weber
Theresa Pekarek-Rosin
F. Ren
S. Wermter
21
8
0
14 Dec 2022
Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
Hyeongju Kim
Hyeong-Seok Choi
6
2
0
13 Dec 2022
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis
Yinjiao Lei
Shan Yang
Xinsheng Wang
Qicong Xie
Jixun Yao
Linfu Xie
Dan Su
DiffM
13
8
0
03 Dec 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Hyeong-Seok Choi
Jinhyeok Yang
Juheon Lee
Hyeongju Kim
18
46
0
17 Nov 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
18
6
0
12 Nov 2022
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features
Ziqian Ning
Qicong Xie
Pengcheng Zhu
Zhichao Wang
Liumeng Xue
Jixun Yao
Linfu Xie
Mengxiao Bi
19
16
0
09 Nov 2022
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping
Junhyeok Lee
Seungu Han
Hyunjae Cho
Wonbin Jung
19
11
0
08 Nov 2022
Self-Supervised Learning for Speech Enhancement through Synthesis
Bryce Irvin
Marko Stamenovic
M. Kegler
Li-Chia Yang
35
18
0
04 Nov 2022
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS
Dongchao Yang
Songxiang Liu
Jianwei Yu
Helin Wang
Chao Weng
Yuexian Zou
DiffM
VLM
33
18
0
04 Nov 2022
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
Jingyi Li
Weiping Tu
Li Xiao
46
96
0
27 Oct 2022
Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance
Yuan-Jui Chen
Ming Tu
Tang-Chun Li
Xin Li
Qiuqiang Kong
Jiaxin Li
Zhichao Wang
Qiao Tian
Yuping Wang
Yuxuan Wang
37
11
0
27 Oct 2022
NWPU-ASLP System for the VoicePrivacy 2022 Challenge
Jixun Yao
Qing Wang
Li Lyna Zhang
Pengcheng Guo
Yuhao Liang
Linfu Xie
PICV
21
16
0
24 Sep 2022
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Mei-Shuo Chen
Z. Duan
22
10
0
23 Sep 2022
End-to-End Voice Conversion with Information Perturbation
Qicong Xie
Shan Yang
Yinjiao Lei
Linfu Xie
Dan Su
15
7
0
15 Jun 2022
Feature Learning and Ensemble Pre-Tasks Based Self-Supervised Speech Denoising and Dereverberation
Yi Li
ShuangLin Li
Yang Sun
S. M. Naqvi
8
0
0
10 Jun 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang
M. Hasegawa-Johnson
D. Roy
30
8
0
19 May 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers
Kaizhi Qian
Yang Zhang
Heting Gao
Junrui Ni
Cheng-I Jeff Lai
David D. Cox
M. Hasegawa-Johnson
Shiyu Chang
DRL
21
110
0
20 Apr 2022
Learning and controlling the source-filter representation of speech with a variational autoencoder
Samir Sadok
Simon Leglaive
Laurent Girin
Xavier Alameda-Pineda
Renaud Séguier
SSL
DRL
BDL
30
14
0
14 Apr 2022
Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus
Minchan Kim
Myeonghun Jeong
Byoung Jin Choi
Sunghwan Ahn
Joun Yeop Lee
N. Kim
36
25
0
29 Mar 2022
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models
Xiaoxiao Miao
Xin Wang
Erica Cooper
Junichi Yamagishi
N. Tomashenko
56
25
0
26 Feb 2022
Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data
A. Shirian
Krishna Somandepalli
T. Guha
SSL
41
10
0
31 Jan 2022
The MSXF TTS System for ICASSP 2022 ADD Challenge
Chunyong Yang
Pengfei Liu
Yanli Chen
Hongbin Wang
Min Liu
10
0
0
27 Jan 2022
Training Robust Zero-Shot Voice Conversion Models with Self-supervised Features
Trung D. Q. Dang
Dung T. Tran
Peter Chin
K. Koishida
SSL
11
15
0
08 Dec 2021
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Edresson Casanova
Julian Weber
C. Shulby
Arnaldo Cândido Júnior
Eren Golge
M. Ponti
179
378
0
04 Dec 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
103
202
0
11 Dec 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
189
288
0
25 Jan 2020
DDSP: Differentiable Digital Signal Processing
Jesse Engel
Lamtharn Hantrakul
Chenjie Gu
Adam Roberts
DiffM
94
373
0
14 Jan 2020
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
Ye Jia
Yu Zhang
Ron J. Weiss
Quan Wang
Jonathan Shen
...
Z. Chen
Patrick Nguyen
Ruoming Pang
Ignacio López Moreno
Yonghui Wu
207
819
0
12 Jun 2018