The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods

12 April 2018
Jaime Lorenzo-Trueba
Junichi Yamagishi
T. Toda
Daisuke Saito
F. Villavicencio
Tomi Kinnunen
Zhenhua Ling

Papers citing "The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods"

47 / 47 papers shown
FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning
Ju Yeon Kang
J. Yoon
Semin Kim
Min Hyun Han
Nam Soo Kim
32
0
0
22 Apr 2025
Harder or Different? Understanding Generalization of Audio Deepfake Detection
Nicolas M. Muller
Nicholas W. D. Evans
Hemlata Tak
Philip Sperl
Konstantin Böttinger
29
3
0
05 Jun 2024
Audio Anti-Spoofing Detection: A Survey
Menglu Li
Yasaman Ahmadiadli
Xiao-Ping Zhang
48
17
0
22 Apr 2024
Explainable Deepfake Video Detection using Convolutional Neural Network and CapsuleNet
Gazi Hasin Ishrak
Zalish Mahmud
Md. Zami Al Zunaed Farabe
Tahera Khanom Tinni
Tanzim Reza
Mohammad Zavid Parvez
48
3
0
19 Apr 2024
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
Hye-jin Shim
Jee-weon Jung
Tomi Kinnunen
21
13
0
31 May 2023
Voice conversion with limited data and limitless data augmentations
Olga Slizovskaia
Jordi Janer
Pritish Chandna
Oscar Mayor
30
1
0
27 Dec 2022
RedPen: Region- and Reason-Annotated Dataset of Unnatural Speech
Kyumin Park
Keon Lee
Daeyoung Kim
Dongyeop Kang
26
0
0
26 Oct 2022
ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild
Xuechen Liu
Xin Wang
Md. Sahidullah
J. Patino
Héctor Delgado
...
Massimiliano Todisco
Junichi Yamagishi
Nicholas W. D. Evans
A. Nautsch
Kong Aik Lee
40
173
0
05 Oct 2022
Controllable Accented Text-to-Speech Synthesis
Rui Liu
Berrak Sisman
Guanglai Gao
Haizhou Li
34
6
0
22 Sep 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review
Enes ALTUNCU
V. N. Franqueira
Shujun Li
28
11
0
21 Aug 2022
Comparison of Speech Representations for the MOS Prediction System
A. Kunikoshi
Jaebok Kim
Won-Suk Jun
K. Sjölander
13
1
0
28 Jun 2022
Joint Optimization of Sampling Rate Offsets Based on Entire Signal Relationship Among Distributed Microphones
Yoshiki Masuyama
K. Yamaoka
Nobutaka Ono
28
5
0
27 Jun 2022
Fusion of Self-supervised Learned Models for MOS Prediction
Zhengdong Yang
Wangjin Zhou
Chenhui Chu
Sheng Li
Raj Dabre
Raphaël Rubino
Yi Zhao
25
28
0
11 Apr 2022
HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement
Pavel Andreev
Aibek Alanov
Oleg Ivanov
Dmitry Vetrov
38
38
0
24 Mar 2022
NORESQA: A Framework for Speech Quality Assessment using Non-Matching References
Pranay Manocha
Buye Xu
Anurag Kumar
35
44
0
16 Sep 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model
Cheng-Hung Hu
Yu-Huai Peng
Junichi Yamagishi
Yu Tsao
Hsin-Min Wang
29
5
0
20 Jul 2021
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Speech is Silver, Silence is Golden: What do ASVspoof-trained Models Really Learn?
Nicolas M. Muller
Franziska Dieckmann
Pavel Czempin
Roman Canals
Konstantin Böttinger
Jennifer Williams
35
70
0
23 Jun 2021
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model
Chenye Cui
Yi Ren
Jinglin Liu
Feiyang Chen
Rongjie Huang
Ming Lei
Zhou Zhao
24
35
0
17 Jun 2021
Emotional Voice Conversion: Theory, Databases and ESD
Kun Zhou
Berrak Sisman
Rui Liu
Haizhou Li
25
168
0
31 May 2021
Deep Learning Based Assessment of Synthetic Speech Naturalness
Gabriel Mittag
Sebastian Möller
20
62
0
23 Apr 2021
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
33
57
0
25 Feb 2021
CDPAM: Contrastive learning for perceptual audio similarity
Pranay Manocha
Zeyu Jin
Richard Y. Zhang
Adam Finkelstein
27
68
0
09 Feb 2021
Optimizing voice conversion network with cycle consistency loss of speaker identity
Hongqiang Du
Xiaohai Tian
Lei Xie
Haizhou Li
21
17
0
17 Nov 2020
Low-resource expressive text-to-speech using data augmentation
Goeric Huybrechts
Thomas Merritt
Giulia Comini
Bartek Perz
Raahil Shah
Jaime Lorenzo-Trueba
26
50
0
11 Nov 2020
CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Nobukatsu Hojo
26
78
0
22 Oct 2020
Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions
Rohan Kumar Das
Tomi Kinnunen
Wen-Chin Huang
Zhenhua Ling
Junichi Yamagishi
Yi Zhao
Xiaohai Tian
T. Toda
31
52
0
08 Sep 2020
Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling
Yeunju Choi
Youngmoon Jung
Hoirin Kim
14
26
0
09 Aug 2020
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning
Berrak Sisman
Junichi Yamagishi
Simon King
Haizhou Li
BDL
41
317
0
09 Aug 2020
Pretraining Techniques for Sequence-to-Sequence Voice Conversion
Wen-Chin Huang
Tomoki Hayashi
Yi-Chiao Wu
Hirokazu Kameoka
T. Toda
27
38
0
07 Aug 2020
Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning With Spoofing Detection and Spoofing Type Classification
Yeunju Choi
Youngmoon Jung
Hoirin Kim
16
27
0
16 Jul 2020
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network
Yi-Chiao Wu
Tomoki Hayashi
Patrick Lumban Tobing
Kazuhiro Kobayashi
T. Toda
27
18
0
11 Jul 2020
DeepSonar: Towards Effective and Robust Detection of AI-Synthesized Fake Voices
Run Wang
Felix Juefei Xu
Yihao Huang
Qing Guo
Xiaofei Xie
Lei Ma
Yang Liu
AAML
25
105
0
28 May 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation
Yi-Chiao Wu
Tomoki Hayashi
T. Okamoto
Hisashi Kawai
T. Toda
29
4
0
18 May 2020
Many-to-Many Voice Transformer Network
Hirokazu Kameoka
Wen-Chin Huang
Kou Tanaka
Takuhiro Kaneko
Nobukatsu Hojo
T. Toda
ViT
30
30
0
18 May 2020
Introducing the VoicePrivacy Initiative
N. Tomashenko
B. M. L. Srivastava
Xin Wang
Emmanuel Vincent
A. Nautsch
...
Nicholas W. D. Evans
J. Patino
J. Bonastre
Paul-Gauthier Noé
Massimiliano Todisco
40
127
0
04 May 2020
The Attacker's Perspective on Automatic Speaker Verification: An Overview
Rohan Kumar Das
Xiaohai Tian
Tomi Kinnunen
Haizhou Li
AAML
20
81
0
19 Apr 2020
SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation
Arya D. McCarthy
Liezl Puzon
J. Pino
31
24
0
27 Feb 2020
Many-to-Many Voice Conversion using Conditional Cycle-Consistent Adversarial Networks
Shindong Lee
Bonggu Ko
Keonnyeong Lee
In-Chul Yoo
Dongsuk Yook
GAN
30
33
0
15 Feb 2020
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion
Wen-Chin Huang
Hao Luo
Hsin-Te Hwang
Chen-Chou Lo
Yu-Huai Peng
Yu Tsao
Hsin-Min Wang
DRL
17
42
0
22 Jan 2020
Multi-task Learning For Detecting and Segmenting Manipulated Facial Images and Videos
H. Nguyen
Fuming Fang
Junichi Yamagishi
Isao Echizen
AAML
CVBM
26
424
0
17 Jun 2019
Voice Mimicry Attacks Assisted by Automatic Speaker Verification
Ville Vestman
Tomi Kinnunen
Rosa González Hautamäki
Md. Sahidullah
34
37
0
03 Jun 2019
TTS Skins: Speaker Conversion via ASR
Adam Polyak
Lior Wolf
Yaniv Taigman
18
27
0
18 Apr 2019
Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation
Shogo Seki
Hirokazu Kameoka
Li Li
T. Toda
K. Takeda
DRL
14
19
0
29 Sep 2018
ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
DRL
16
59
0
13 Aug 2018
StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks
Hirokazu Kameoka
Takuhiro Kaneko
Kou Tanaka
Nobukatsu Hojo
34
370
0
06 Jun 2018
Yi-Chiao Wu
Kazuhiro Kobayashi
Tomoki Hayashi
Patrick Lumban Tobing
T. Toda
7
25
0
30 Apr 2018