High Fidelity Speech Synthesis with Adversarial Networks

25 September 2019

Papers citing "High Fidelity Speech Synthesis with Adversarial Networks"

50 / 149 papers shown

Title
DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis Zeeshan Ahmad Shudi Bao Meng Chen 15 0 0 14 May 2025
Hierarchical Conditional Tabular GAN for Multi-Tabular Synthetic Data Generation Wilhelm Ågren Victorio Úbeda Sosa 30 0 0 11 Nov 2024
Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio Leigh Abbott Milan Marocchi Matthew Fynn Yue Rong Sven Nordholm MedIm 20 0 0 14 Oct 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation Sang-Hoon Lee Ha-Yeong Choi Seong-Whan Lee OOD DiffM AI4TS 43 5 0 14 Aug 2024
Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization Junyan Wu Wei Lu Xiangyang Luo Rui Yang Qian Wang Xiaochun Cao 34 3 0 23 Jul 2024
A Survey of Deep Learning Audio Generation Methods Matej Bozic Marko Horvat VLM MedIm 52 0 0 31 May 2024
SoK: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems Haozhe Xu Cong Wu Yangyang Gu Xingcan Shang Jing Chen Kun He Ruiying Du 32 3 0 27 May 2024
Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model Xiangyu Zhang Daijiao Liu Hexin Liu Qiquan Zhang Hanyu Meng Leibny Paola García Chng Eng Siong Lina Yao DiffM 15 2 0 16 Feb 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy Weijian Mai Jian Zhang Pengfei Fang Zhijun Zhang 42 9 0 31 Dec 2023
The Effects of Signal-to-Noise Ratio on Generative Adversarial Networks Applied to Marine Bioacoustic Data Georgia Atkinson Nick Wright A. McGough Per Berggren GAN 13 0 0 22 Dec 2023
A Representative Study on Human Detection of Artificially Generated Media Across Countries Joel Frank Franziska Herbert Jonas Ricker Lea Schönherr Thorsten Eisenhofer Asja Fischer Markus Dürmuth Thorsten Holz 25 12 0 10 Dec 2023
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Daniel Y. Fu Hermann Kumbong Eric N. D. Nguyen Christopher Ré VLM 36 29 0 10 Nov 2023
Enabling Acoustic Audience Feedback in Large Virtual Events Tamay Aykut M. Hofbauer Christopher B. Kuhn Eckehard Steinbach Bernd Girod 38 0 0 27 Oct 2023
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech Dareen Alharthi Roshan S. Sharma Hira Dhamyal Soumi Maiti Bhiksha Raj Rita Singh 21 4 0 01 Oct 2023
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching Yiwei Guo Chenpeng Du Ziyang Ma Xie Chen K. Yu DiffM 25 36 0 10 Sep 2023
AI-Generated Content (AIGC) for Various Data Modalities: A Survey Lin Geng Foo Hossein Rahmani J. Liu 73 31 0 27 Aug 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design Jungil Kong Jihoon Park Beomjeong Kim Jeongmin Kim Dohee Kong Sangjin Kim 21 35 0 31 Jul 2023
The Ethical Implications of Generative Audio Models: A Systematic Literature Review J. Barnett 16 25 0 07 Jul 2023
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading Yochai Yemini Aviv Shamsian Lior Bracha Sharon Gannot Ethan Fetaya DiffM 11 9 0 05 Jun 2023
UnDiff: Unsupervised Voice Restoration with Unconditional Diffusion Model A. Iashchenko Pavel Andreev Ivan Shchekotov Nicholas Babaev Dmitry Vetrov DiffM 16 1 0 01 Jun 2023
U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech Xin Jing Yi Chang Zijiang Yang Jiang-jian Xie Andreas Triantafyllopoulos Bjoern W. Schuller 26 10 0 22 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra Yang Ai Zhenhua Ling 13 13 0 13 May 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings Wei Xue Yiwen Wang Qi-fei Liu Yi-Ting Guo 19 1 0 09 May 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis Ye-Xin Lu Yang Ai Zhenhua Ling 14 1 0 26 Apr 2023
ArmanTTS single-speaker Persian dataset Mohammd Hasan Shamgholi Vahid Saeedi J. Peymanfard Leila Alhabib Hossein Zeinali 19 2 0 07 Apr 2023
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI Chenshuang Zhang Chaoning Zhang Sheng Zheng Mengchun Zhang Maryam Qamar Sung-Ho Bae In So Kweon DiffM MedIm 41 64 0 23 Mar 2023
Speech Modeling with a Hierarchical Transformer Dynamical VAE Xiaoyu Lin Xiaoyu Bie Simon Leglaive Laurent Girin Xavier Alameda-Pineda BDL 24 2 0 07 Mar 2023
Contrast-PLC: Contrastive Learning for Packet Loss Concealment Huaying Xue Xiulian Peng Yan Lu 41 4 0 26 Feb 2023
Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation Shahar Lutati Eliya Nachmani Lior Wolf DiffM 32 14 0 25 Jan 2023
MooseNet: A Trainable Metric for Synthesized Speech with a PLDA Module Ondřej Plátek Ondřej Dušek 21 2 0 17 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech Ze Chen Yihan Wu Yichong Leng Jiawei Chen Haohe Liu ... Ke Wang Lei He Sheng Zhao Jiang Bian Danilo P. Mandic DiffM 22 22 0 30 Dec 2022
Semantics-Empowered Communication: A Tutorial-cum-Survey Zhilin Lu Rongpeng Li Kun Lu Xianfu Chen E. Hossain Zhifeng Zhao Honggang Zhang 29 19 0 16 Dec 2022
BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric Mingda Chen Paul-Ambroise Duquenne Pierre Yves Andrews Justine T. Kao Alexandre Mourachko Holger Schwenk Marta R. Costa-jussá 14 17 0 16 Dec 2022
Evaluating and reducing the distance between synthetic and real speech distributions Christoph Minixhofer Ondřej Klejch P. Bell 21 7 0 29 Nov 2022
Deep Fake Detection, Deterrence and Response: Challenges and Opportunities Amin Azmoodeh Ali Dehghantanha 29 2 0 26 Nov 2022
Towards Building Text-To-Speech Systems for the Next Billion Users Gokul Karthik Kumar Praveen S. V. Pratyush Kumar Mitesh M. Khapra Karthik Nandakumar 34 18 0 17 Nov 2022
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis Hyeong-Seok Choi Jinhyeok Yang Juheon Lee Hyeongju Kim 16 46 0 17 Nov 2022
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS Ziqi Liang 13 0 0 24 Oct 2022
Adversarial Permutation Invariant Training for Universal Sound Separation Emilian Postolache Jordi Pons Santiago Pascual Joan Serra VLM 18 6 0 21 Oct 2022
Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion Yuta Matsunaga Takaaki Saeki Shinnosuke Takamichi Hiroshi Saruwatari 9 1 0 18 Oct 2022
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules Kazuki Irie Jürgen Schmidhuber 32 5 0 07 Oct 2022
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN Yin-Ping Cho Yu Tsao Hsin-Min Wang Yi-Wen Liu DiffM 20 8 0 21 Sep 2022
Lightweight Long-Range Generative Adversarial Networks Bowen Li Thomas Lukasiewicz GAN 35 3 0 08 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation Zalan Borsos Raphaël Marinier Damien Vincent Eugene Kharitonov Olivier Pietquin ... Dominik Roblek O. Teboul David Grangier Marco Tagliasacchi Neil Zeghidour AuLLM 28 566 0 07 Sep 2022
Exploiting Pre-trained Feature Networks for Generative Adversarial Networks in Audio-domain Loop Generation Yen-Tung Yeh Bo-Yu Chen Yi-Hsuan Yang 32 6 0 05 Sep 2022
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild Sindhu B. Hegde Prajwal K R Rudrabha Mukhopadhyay Vinay P. Namboodiri C. V. Jawahar 40 10 0 01 Sep 2022
Music Separation Enhancement with Generative Modeling N. Schaffer Boaz Cogan Ethan Manilow Max Morrison Prem Seetharaman Bryan Pardo 15 9 0 26 Aug 2022
Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review Enes Altuncu V. N. Franqueira Shujun Li 21 11 0 21 Aug 2022
Generative Extraction of Audio Classifiers for Speaker Identification Tejumade Afonja Lucas Bourtoule Varun Chandrasekaran Sageev Oore Nicolas Papernot AAML 13 1 0 26 Jul 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System Yi-Chiao Wu Patrick Lumban Tobing Kazuki Yasuhara Noriyuki Matsunaga Yamato Ohtani T. Toda 22 0 0 13 Jul 2022