1910.11480
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
25 October 2019
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
Papers citing
"Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram"
50 / 464 papers shown
Title
Watermarking Autoregressive Image Generation
Nikola Jovanović
Ismail Labiad
Tomáš Souček
Martin Vechev
Pierre Fernandez
WIGM
40
0
0
19 Jun 2025
Study of Lightweight Transformer Architectures for Single-Channel Speech Enhancement
Haixin Zhao
Nilesh Madhu
63
0
0
27 May 2025
ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis
Hawau Olamide Toyin
Rufael Marew
Humaid Alblooshi
Samar M. Magdy
Hanan Aldarmaki
29
0
0
26 May 2025
STOPA: A Database of Systematic VariaTion Of DeePfake Audio for Open-Set Source Tracing and Attribution
Anton Firc
Manasi Chibber
Jagabandhu Mishra
Vishwanath Pratap Singh
Tomi Kinnunen
K. Malinka
150
0
0
26 May 2025
SpeakStream: Streaming Text-to-Speech with Interleaved Data
Richard He Bai
Zijin Gu
Tatiana Likhomanenko
Navdeep Jaitly
AuLLM
AI4TS
48
0
0
25 May 2025
Differentiable K-means for Fully-optimized Discrete Token-based ASR
Kentaro Onda
Yosuke Kashiwagi
E. Tsunoo
Hayato Futami
Shinji Watanabe
57
0
0
22 May 2025
Anti-aliasing of neural distortion effects via model fine tuning
Alistair Carson
Alec Wright
Stefan Bilbao
46
0
0
16 May 2025
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis
Zeeshan Ahmad
Shudi Bao
Meng Chen
52
0
0
14 May 2025
Lightweight End-to-end Text-to-speech Synthesis for low resource on-device applications
Biel Tura Vecino
Adam Gabryś
Daniel Mątwicki
Andrzej Pomirski
Tom Iddon
Marius Cotescu
Jaime Lorenzo-Trueba
197
3
0
12 May 2025
Generative Adversarial Network based Voice Conversion: Techniques, Challenges, and Recent Advancements
Sandipan Dhar
N. D. Jana
Swagatam Das
79
0
0
27 Apr 2025
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
83
0
0
08 Apr 2025
Wireless Hearables With Programmable Speech AI Accelerators
Malek Itani
Tuochao Chen
Arun Raghavan
Gavriel Kohlberg
Shyamnath Gollakota
AuLLM
84
0
0
24 Mar 2025
Measuring the Robustness of Audio Deepfake Detectors
Xiang Li
Pin-Yu Chen
Wenqi Wei
79
0
0
21 Mar 2025
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
Tianze Luo
Xingchen Miao
Wenbo Duan
DiffM
91
0
0
20 Mar 2025
A Hypernetwork-Based Approach to KAN Representation of Audio Signals
Patryk Marszałek
Maciej Rut
Piotr Kawa
Przemysław Spurek
P. Syga
143
0
0
04 Mar 2025
FlowDec: A flow-based full-band general audio codec with high perceptual quality
Simon Welker
Matthew Le
Ricky T. Q. Chen
Wei-Ning Hsu
Timo Gerkmann
Alexander Richard
Yi-Chiao Wu
98
1
0
03 Mar 2025
NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing
Yifan Liang
Fangkun Liu
Andong Li
Xiaodong Li
C. Zheng
96
1
0
17 Feb 2025
The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG
Francesco Stefano Carzaniga
Gary Tom Hoppeler
Michael Hersche
Kaspar Anton Schindler
Abbas Rahimi
79
0
0
10 Feb 2025
Resampling Filter Design for Multirate Neural Audio Effect Processing
Alistair Carson
Vesa Välimäki
Alec Wright
Stefan Bilbao
142
1
0
30 Jan 2025
Memory-Centric Computing: Recent Advances in Processing-in-DRAM
O. Mutlu
Ataberk Olgun
Geraldo F. Oliveira
Ismail Emir Yüksel
121
4
0
26 Dec 2024
Robust AI-Synthesized Speech Detection Using Feature Decomposition Learning and Synthesizer Feature Augmentation
Kuiyuan Zhang
Zhongyun Hua
Yushu Zhang
Yifang Guo
Tao Xiang
59
3
0
14 Nov 2024
Wavehax: Aliasing-Free Neural Waveform Synthesis Based on 2D Convolution and Harmonic Prior for Reliable Complex Spectrogram Estimation
Reo Yoneyama
Atsushi Miyashita
Ryuichi Yamamoto
Tomoki Toda
70
2
0
11 Nov 2024
Acoustic Volume Rendering for Neural Impulse Response Fields
Zitong Lan
Chenhao Zheng
Zhiwei Zheng
Mingmin Zhao
73
4
0
09 Nov 2024
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis
Kehan Sui
Jinxu Xiang
Fang Jin
DiffM
45
0
0
29 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
52
0
0
29 Oct 2024
Optimal Transport Maps are Good Voice Converters
Arip Asadulaev
Rostislav Korst
V. Shutov
Alexander Korotin
Yaroslav Grebnyak
Vahe Egiazarian
Evgeny Burnaev
OT
55
2
0
17 Oct 2024
CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning
Sjoerd Groot
Qinyu Chen
Jan C. van Gemert
Chang Gao
Mamba
465
0
0
14 Oct 2024
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura
Takumi Hirose
Masanari Ohi
Hideki Nakayama
Nakamasa Inoue
VLM
112
2
0
06 Oct 2024
A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings
Haopeng Geng
Daisuke Saito
Nobuaki Minematsu
71
3
0
03 Oct 2024
Exploring synthetic data for cross-speaker style transfer in style representation based TTS
Lucas Ueda
Leonardo B. de M. M. Marques
Flávio O. Simões
Mário Uliani Neto
Fernando Runstein
Bianca Dal Bó
Paula D. P. Costa
91
0
0
25 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Alexander Polonsky
Canh Vu
129
5
0
23 Sep 2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild
Jee-weon Jung
Yihan Wu
Xin Wang
Ji-Hoon Kim
Soumi Maiti
...
Joon Son Chung
Wangyou Zhang
Seyun Um
Shinnosuke Takamichi
Shinji Watanabe
155
4
0
18 Sep 2024
Discrete Unit based Masking for Improving Disentanglement in Voice Conversion
Philip H. Lee
Ismail Rasim Ulgen
Berrak Sisman
86
0
0
17 Sep 2024
SafeEar: Content Privacy-Preserving Audio Deepfake Detection
Xinfeng Li
Kai Li
Yifan Zheng
Chen Yan
Xiaoyu Ji
Wei Dong
83
16
0
14 Sep 2024
Text-To-Speech Synthesis In The Wild
Jee-weon Jung
Wangyou Zhang
Soumi Maiti
Yihan Wu
Xin Eric Wang
...
Hye-jin Shim
Nicholas W. D. Evans
Joon Son Chung
Shinnosuke Takamichi
Shinji Watanabe
100
2
0
13 Sep 2024
Janssen 2.0: Audio Inpainting in the Time-frequency Domain
Ondřej Mokrý
Peter Balušík
P. Rajmic
84
0
0
10 Sep 2024
Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
Hongfei Xue
Rong Gong
Mingchen Shao
Xin Xu
L. xilinx Wang
...
Yong Qin
Jun Du
Ming Li
Binbin Zhang
Bin Jia
74
2
0
09 Sep 2024
A multilingual training strategy for low resource Text to Speech
Asma Amalas
Mounir Ghogho
Mohamed Chetouani
Rachid Oulad Haj Thami
71
2
0
02 Sep 2024
Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion
Yan Rong
Li Liu
62
5
0
01 Sep 2024
Unsupervised Composable Representations for Audio
Giovanni Bindi
P. Esling
DiffM
OCL
CoGe
79
1
0
19 Aug 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation
Sang-Hoon Lee
Ha-Yeong Choi
Seong-Whan Lee
OOD
DiffM
AI4TS
110
6
0
14 Aug 2024
VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders
Yubing Cao
Yongming Li
Liejun Wang
Yinfeng Yu
59
0
0
13 Aug 2024
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
Jiangyan Yi
Chu Yuan Zhang
Jianhua Tao
Chenglong Wang
Xinrui Yan
Yong Ren
Hao Gu
Junzuo Zhou
95
5
0
09 Aug 2024
Central Kurdish Text-to-Speech Synthesis with Novel End-to-End Transformer Training
Hawraz A. Ahmad
Tarik A. Rashid
136
0
0
06 Aug 2024
Wavespace: A Highly Explorable Wavetable Generator
Hazounne Lee
Kihong Kim
Sungho Lee
Kyogu Lee
76
0
0
29 Jul 2024
Speech Editing -- a Summary
Tobias Kässmann
Yining Liu
Danni Liu
65
1
0
24 Jul 2024
Distortion Recovery: A Two-Stage Method for Guitar Effect Removal
Ying-Shuo Lee
Yueh-Po Peng
Jui-Te Wu
Ming Cheng
Li Su
Yi-Hsuan Yang
67
1
0
23 Jul 2024
dMel: Speech Tokenization made Simple
Richard He Bai
Tatiana Likhomanenko
Ruixiang Zhang
Zijin Gu
Zakaria Aldeneh
Navdeep Jaitly
113
6
0
22 Jul 2024
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors
J. Hauret
Malo Olivier
Thomas Joubaud
C. Langrenne
Sarah Poirée
V. Zimpfer
Éric Bavu
185
5
0
16 Jul 2024
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis
Weizhi Liu
Yue Li
Dongdong Lin
Hui Tian
Haizhou Li
WIGM
111
10
0
15 Jul 2024