Sequence-to-Sequence Acoustic Modeling for Voice Conversion


16 October 2018
Jing-Xuan Zhang
Zhenhua Ling
Li-Juan Liu
Yuan Jiang
Lirong Dai

Papers citing "Sequence-to-Sequence Acoustic Modeling for Voice Conversion"

50 / 57 papers shown
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
Xinlei Niu
Jing Zhang
Charles Patrick Martin
34
2
0
24 Apr 2024
Voice Attribute Editing with Text Prompt
Zheng-Yan Sheng
Yang Ai
Li-Juan Liu
Jia Pan
Zhenhua Ling
28
6
0
13 Apr 2024
AAS-VC: On the Generalization Ability of Automatic Alignment Search based Non-autoregressive Sequence-to-sequence Voice Conversion
Wen-Chin Huang
Kazuhiro Kobayashi
T. Toda
19
2
0
14 Sep 2023
Parallel and Limited Data Voice Conversion Using Stochastic Variational Deep Kernel Learning
Mohamadreza Jafaryani
H. Sheikhzadeh
V. Pourahmadi
19
4
0
08 Sep 2023
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training
H. Oh
Sang-Hoon Lee
Seong-Whan Lee
DiffM
28
14
0
31 Jul 2023
Rhythm Modeling for Voice Conversion
Benjamin van Niekerk
M. Carbonneau
Herman Kamper
40
5
0
12 Jul 2023
PauseSpeech: Natural Speech Synthesis via Pre-trained Language Model and Pause-based Prosody Modeling
Ji-Sang Hwang
Sang-Hoon Lee
Seong-Whan Lee
24
4
0
13 Jun 2023
Direct Speech-to-speech Translation without Textual Annotation using Bottleneck Features
Junhui Zhang
Junjie Pan
Xiang Yin
Zejun Ma
27
0
0
12 Dec 2022
Two-stage training method for Japanese electrolaryngeal speech enhancement based on sequence-to-sequence voice conversion
D. Ma
Lester Phillip Violeta
Kazuhiro Kobayashi
T. Toda
29
6
0
19 Oct 2022
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion
Wen-Chin Huang
Shu-Wen Yang
Tomoki Hayashi
T. Toda
21
15
0
10 Jul 2022
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian
Chunlei Zhang
Gopala Krishna Anumanchipalli
Dong Yu
26
23
0
11 May 2022
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion
Jiachen Lian
Chunlei Zhang
Dong Yu
DRL
30
51
0
30 Mar 2022
Disentangleing Content and Fine-grained Prosody Information via Hybrid ASR Bottleneck Features for Voice Conversion
Xintao Zhao
Feng Liu
Changhe Song
Zhiyong Wu
Shiyin Kang
Deyi Tuo
Helen Meng
26
21
0
24 Mar 2022
Phase-Aware Spoof Speech Detection Based on Res2Net with Phase Network
Juntae Kim
S. Ban
29
18
0
21 Mar 2022
Real time spectrogram inversion on mobile phone
Oleg Rybakov
Marco Tagliasacchi
Yunpeng Li
Liyang Jiang
Xia Zhang
Fadi Biadsy
23
4
0
01 Mar 2022
Noise-robust voice conversion with domain adversarial training
Hongqiang Du
Lei Xie
Haizhou Li
19
11
0
26 Jan 2022
Emotion Intensity and its Control for Emotional Voice Conversion
Kun Zhou
Berrak Sisman
R. Rana
Björn W. Schuller
Haizhou Li
65
54
0
10 Jan 2022
Towards Identity Preserving Normal to Dysarthric Voice Conversion
Wen-Chin Huang
B. Halpern
Lester Phillip Violeta
O. Scharenborg
T. Toda
44
21
0
15 Oct 2021
On Prosody Modeling for ASR+TTS based Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Xinjian Li
Shinji Watanabe
T. Toda
35
8
0
20 Jul 2021
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion
Wen-Chin Huang
Kazuhiro Kobayashi
Yu-Huai Peng
Ching-Feng Liu
Yu Tsao
Hsin-Min Wang
T. Toda
23
10
0
02 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
33
168
0
31 May 2021
FastS2S-VC: Streaming Non-Autoregressive Sequence-to-Sequence Voice Conversion
Hirokazu Kameoka
Kou Tanaka
Takuhiro Kaneko
39
21
0
14 Apr 2021
Beyond Categorical Label Representations for Image Classification
Boyuan Chen
Yu Li
Sunand Raghupathi
Hod Lipson
SSL
37
2
0
06 Apr 2021
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training
Kun Zhou
Berrak Sisman
Haizhou Li
23
27
0
31 Mar 2021
Attention, please! A survey of Neural Attention Models in Deep Learning
Alana de Santana Correia
Esther Luna Colombini
HAI
23
175
0
31 Mar 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
38
57
0
25 Feb 2021
Accent and Speaker Disentanglement in Many-to-many Voice Conversion
Zhichao Wang
Wenshuo Ge
Xiong Wang
Shan Yang
Wendong Gan
Haitao Chen
Hai Li
Lei Xie
Xiulin Li
CVBM
36
32
0
17 Nov 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations
Wen-Chin Huang
Yi-Chiao Wu
Tomoki Hayashi
T. Toda
BDL
54
37
0
23 Oct 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders
Wen-Chin Huang
Patrick Lumban Tobing
Yi-Chiao Wu
Kazuhiro Kobayashi
T. Toda
19
8
0
09 Oct 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS
Wen-Chin Huang
Tomoki Hayashi
Shinji Watanabe
T. Toda
DRL
15
39
0
06 Oct 2020
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data
Mingyang Zhang
Yi Zhou
Li Zhao
Haizhou Li
24
52
0
30 Sep 2020
Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions
Rohan Kumar Das
Tomi Kinnunen
Wen-Chin Huang
Zhenhua Ling
Junichi Yamagishi
Yi Zhao
Xiaohai Tian
T. Toda
31
52
0
08 Sep 2020
Any-to-Many Voice Conversion with Location-Relative Sequence-to-Sequence Modeling
Songxiang Liu
Yuewen Cao
Disong Wang
Xixin Wu
Xunying Liu
Helen Meng
BDL
29
88
0
06 Sep 2020
Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer
Jing-Xuan Zhang
Li-Juan Liu
Yan-Nian Chen
Ya-Jun Hu
Yuan Jiang
Zhenhua Ling
Lirong Dai
13
17
0
03 Sep 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
41
318
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
27
38
0
07 Aug 2020
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
15
6
0
05 Aug 2020
A Pyramid Recurrent Network for Predicting Crowdsourced Speech-Quality Ratings of Real-World Signals
Xuan Dong
Donald Williamson
22
20
0
31 Jul 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
30
30
0
18 May 2020
ConVoice: Real-Time Zero-Shot Voice Style Transfer with Convolutional Network
Yurii Rebryk
Stanislav Beliaev
16
8
0
15 May 2020
End-to-End Whisper to Natural Speech Conversion using Modified Transformer Network
Abhishek Niranjan
Mukesh Sharma
Sai Bharath Chandra Gutha
M. Shaik
22
1
0
20 Apr 2020
Anchor Attention for Hybrid Crowd Forecasts Aggregation
Yuzhong Huang
A. Abeliuk
Fred Morstatter
P. Atanasov
Aram Galstyan
19
3
0
03 Mar 2020
Mel-spectrogram augmentation for sequence to sequence voice conversion
Yeongtae Hwang
Hyemin Cho
Hongsun Yang
Dong-Ok Won
Insoo Oh
Seong-Whan Lee
30
15
0
06 Jan 2020
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
30
95
0
14 Dec 2019
Emotional Voice Conversion using Multitask Learning with Text-to-speech
Tae-Ho Kim
Sungjae Cho
Shinkook Choi
Sejik Park
Soo-Young Lee
27
37
0
11 Nov 2019
Statistical Voice Conversion with Quasi-Periodic WaveNet Vocoder
Yi-Chiao Wu
Patrick Lumban Tobing
Tomoki Hayashi
Kazuhiro Kobayashi
T. Toda
21
2
0
21 Jul 2019
Hierarchical Sequence to Sequence Voice Conversion with Limited Data
P. Narayanan
Punarjay Chakravarty
F. Charette
G. Puskorius
23
3
0
15 Jul 2019
Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations
Jing-Xuan Zhang
Zhenhua Ling
Lirong Dai
22
99
0
25 Jun 2019
Investigation of F0 conditioning and Fully Convolutional Networks in Variational Autoencoder based Voice Conversion
Wen-Chin Huang
Yi-Chiao Wu
Chen-Chou Lo
Patrick Lumban Tobing
Tomoki Hayashi
Kazuhiro Kobayashi
T. Toda
Yu Tsao
H. Wang
DRL
24
13
0
02 May 2019
Direct speech-to-speech translation with a sequence-to-sequence model
Ye Jia
Ron J. Weiss
Fadi Biadsy
Wolfgang Macherey
Melvin Johnson
Zhehuai Chen
Yonghui Wu
21
223
0
12 Apr 2019