1910.11480
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Papers citing "Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"
50 / 464 papers shown
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability
Hyun Joon Park
Jin Sob Kim
Wooseok Shin
Sung Won Han
DiffM
65
3
0
27 Jun 2024
A Study on Synthesizing Expressive Violin Performances: Approaches and Comparisons
Tzu-Yun Hung
Jui-Te Wu
Yu-Chia Kuo
Yo-Wei Hsiao
Ting-Wei Lin
Li Su
65
0
0
26 Jun 2024
Improving Unsupervised Clean-to-Rendered Guitar Tone Transformation Using GANs and Integrated Unaligned Clean Data
Yu-Hua Chen
Woosung Choi
Wei-Hsiang Liao
Marco A. Martínez-Ramírez
K. Cheuk
Yuki Mitsufuji
J. Jang
Yi-Hsuan Yang
79
5
0
22 Jun 2024
End-to-end Streaming model for Low-Latency Speech Anonymization
Waris Quamer
Ricardo Gutierrez-Osuna
96
0
0
13 Jun 2024
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
Hyunjae Cho
Junhyeok Lee
Wonbin Jung
51
0
0
10 Jun 2024
MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance
Semin Kim
Myeonghun Jeong
Hyeonseung Lee
Minchan Kim
Byoung Jin Choi
Nam Soo Kim
VLM
DiffM
108
1
0
10 Jun 2024
Approximated Coded Computing: Towards Fast, Private and Secure Distributed Machine Learning
Houming Qiu
Kun Zhu
Nguyen Cong Luong
Dusit Niyato
FedML
73
0
0
07 Jun 2024
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen
Jiawei Du
Haibin Wu
Jyh-Shing Roger Jang
Hung-yi Lee
73
3
0
07 Jun 2024
Searching For Music Mixing Graphs: A Pruning Approach
Sungho Lee
Marco A. Martínez-Ramírez
Wei-Hsiang Liao
Stefan Uhlich
Giorgio Fabbro
Kyogu Lee
Yuki Mitsufuji
102
3
0
03 Jun 2024
NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields
Amandine Brunetto
Sascha Hornauer
Fabien Moutarde
145
2
0
28 May 2024
Ambisonizer: Neural Upmixing as Spherical Harmonics Generation
Yongyi Zang
Yifan Wang
Minglun Lee
46
1
0
22 May 2024
Exploring speech style spaces with language models: Emotional TTS without emotion labels
Shreeram Suresh Chandra
Zongyang Du
Berrak Sisman
76
2
0
18 May 2024
Building a Luganda Text-to-Speech Model From Crowdsourced Data
Sulaiman Kagumire
Andrew Katumba
J. Nakatumba‐Nabende
John Quinn
33
1
0
16 May 2024
The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio
Yuankun Xie
Yi Lu
Ruibo Fu
Zhengqi Wen
Zhiyong Wang
...
Xiaopeng Wang
Yukun Liu
Haonan Cheng
Long Ye
Yi Sun
98
21
0
08 May 2024
HILCodec: High Fidelity and Lightweight Neural Audio Codec
S. Ahn
Beom Jun Woo
Mingrui Han
Chanyeong Moon
Nam Soo Kim
43
9
0
08 May 2024
TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms
Yueyuan Sui
Minghui Zhao
Junxi Xia
Xiaofan Jiang
S. Xia
Mamba
93
11
0
02 May 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Haizhou Li
Zhizheng Wu
55
3
0
26 Apr 2024
Music Style Transfer With Diffusion Model
Hong Huang
Yuyi Wang
Luyao Li
Jun Lin
DiffM
55
0
0
23 Apr 2024
Differentiable All-pole Filters for Time-varying Audio Systems
Chin-Yun Yu
Christopher Mitcheltree
Alistair Carson
Stefan Bilbao
Joshua D. Reiss
Gyorgy Fazekas
86
3
0
11 Apr 2024
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen
I. D. Gebru
Christian Richardt
Anurag Kumar
William Laney
Andrew Owens
Alexander Richard
112
20
0
27 Mar 2024
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
52
0
0
25 Mar 2024
Modeling Analog Dynamic Range Compressors using Deep Learning and State-space Models
Hanzhi Yin
Gang Cheng
Christian J. Steinmetz
Ruibin Yuan
Richard M. Stern
Roger B. Dannenberg
51
6
0
24 Mar 2024
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech
Ziqi Liang
Haoxiang Shi
Jiawei Wang
Keda Lu
75
0
0
13 Mar 2024
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction
Peng Liu
Dongyang Dai
Zhiyong Wu
144
3
0
08 Mar 2024
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification
Sonal Joshi
Thomas Thebaud
Jesús Villalba
Najim Dehak
AAML
55
1
0
29 Feb 2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono
Kei Hashimoto
Yoshihiko Nankaku
Keiichi Tokuda
DiffM
67
3
0
22 Feb 2024
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Shijia Liao
Shiyi Lan
Arun George Zachariah
38
1
0
31 Jan 2024
MunTTS: A Text-to-Speech System for Mundari
Varun Gumma
Rishav Hada
Aditya Yadavalli
Pamir Gogoi
Ishani Mondal
Vivek Seshadri
Kalika Bali
59
1
0
28 Jan 2024
BAE-Net: A Low complexity and high fidelity Bandwidth-Adaptive neural network for speech super-resolution
Guochen Yu
Xiguang Zheng
Nan Li
Runqiang Han
C. Zheng
Chen Zhang
Chao Zhou
Qi Huang
Bin Yu
129
6
0
21 Dec 2023
BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0
Miseul Kim
Zhenyu Piao
Jihyun Lee
Hong-Goo Kang
129
3
0
21 Dec 2023
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
Xueyao Zhang
Liumeng Xue
Yicheng Gu
Yuancheng Wang
Haorui He
...
Mingxuan Wang
Jun Han
Kai Chen
Haizhou Li
Zhizheng Wu
91
35
0
15 Dec 2023
FlowMur: A Stealthy and Practical Audio Backdoor Attack with Limited Knowledge
Jiahe Lan
Jie Wang
Baochen Yan
Zheng Yan
Elisa Bertino
AAML
103
11
0
15 Dec 2023
Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification
Haibin Wu
Heng-Cheng Kuo
Yu Tsao
Hung-yi Lee
AAML
62
2
0
14 Dec 2023
ROSE: A Recognition-Oriented Speech Enhancement Framework in Air Traffic Control Using Multi-Objective Learning
Xincheng Yu
Dongyue Guo
Jianwei Zhang
Yi Lin
53
3
0
11 Dec 2023
A Representative Study on Human Detection of Artificially Generated Media Across Countries
Joel Frank
Franziska Herbert
Jonas Ricker
Lea Schonherr
Thorsten Eisenhofer
Asja Fischer
Markus Dürmuth
Thorsten Holz
93
15
0
10 Dec 2023
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
Binzhu Sha
Xu Li
Zhiyong Wu
Yin Shan
Helen M. Meng
54
7
0
08 Dec 2023
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
72
8
0
07 Dec 2023
Detecting Voice Cloning Attacks via Timbre Watermarking
Chang-rui Liu
Jie Zhang
Tianwei Zhang
Xi Yang
Weiming Zhang
Neng H. Yu
97
38
0
06 Dec 2023
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning
Raviraj Joshi
Nikesh Garera
73
2
0
02 Dec 2023
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints
Raviraj Joshi
Nikesh Garera
79
0
0
02 Dec 2023
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
Yicheng Gu
Xueyao Zhang
Liumeng Xue
Zhizheng Wu
72
12
0
25 Nov 2023
Controllable Music Production with Diffusion Models and Guidance Gradients
Mark Levy
Bruno Di Giorgi
Floris Weers
Angelos Katharopoulos
Tom Nickson
DiffM
119
23
0
01 Nov 2023
High-Fidelity Noise Reduction with Differentiable Signal Processing
C. Steinmetz
Thomas Walther
Joshua D. Reiss
50
3
0
17 Oct 2023
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset
Ze Liu
48
1
0
08 Oct 2023
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi
Hirofumi Inaguma
Xutai Ma
Ilia Kulikov
Anna Y. Sun
115
27
0
04 Oct 2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform
Yinghao Aaron Li
Cong Han
Xilin Jiang
N. Mesgarani
101
4
0
18 Sep 2023
Face-Driven Zero-Shot Voice Conversion with Memory-based Face-Voice Alignment
Zheng-Yan Sheng
Yang Ai
Yan-Nian Chen
Zhenhua Ling
CVBM
53
4
0
18 Sep 2023
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias
Sipan Li
Songxiang Liu
Lu Zhang
Xiang Li
Yanyao Bian
Chao Weng
Zhiyong Wu
Helen Meng
45
2
0
14 Sep 2023
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
Chu Yuan Zhang
Jiangyan Yi
Jianhua Tao
Chenglong Wang
Xinrui Yan
87
8
0
13 Sep 2023
Differentiable Modelling of Percussive Audio with Transient and Spectral Synthesis
Jordie Shier
Franco Caspe
Andrew Robertson
Mark Sandler
C. Saitis
Andrew Mcpherson
66
3
0
13 Sep 2023