FloWaveNet : A Generative Flow for Raw Audio

6 November 2018

Papers citing "FloWaveNet : A Generative Flow for Raw Audio"

50 / 108 papers shown

Memory-Centric Computing: Recent Advances in Processing-in-DRAM O. Mutlu Ataberk Olgun Geraldo F. Oliveira Ismail Emir Yüksel 58 5 0 26 Dec 2024
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation Sang-Hoon Lee Ha-Yeong Choi Seong-Whan Lee OOD DiffM AI4TS 55 5 0 14 Aug 2024
Stochastic Parrots or ICU Experts? Large Language Models in Critical Care Medicine: A Scoping Review Tongyue Shi Jun Ma Zihan Yu Haowei Xu Minqi Xiong Meirong Xiao Yilin Li Huiying Zhao Guilan Kong 50 1 0 27 Jul 2024
A Survey of Deep Learning Audio Generation Methods Matej Bozic Marko Horvat VLM MedIm 66 0 0 31 May 2024
Large Language Models for Medicine: A Survey Yanxin Zheng Wensheng Gan Zefeng Chen Zhenlian Qi Qian Liang Philip S. Yu LM&MA 33 16 0 20 May 2024
Music Style Transfer With Diffusion Model Hong Huang Yuyi Wang Luyao Li Jun Lin DiffM 27 0 0 23 Apr 2024
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models Xiang Li Fan Bu Ambuj Mehrish Yingting Li Jiale Han Bo Cheng Soujanya Poria DiffM 40 6 0 31 Mar 2024
E3 TTS: Easy End-to-End Diffusion-based Text to Speech Yuan Gao Nobuyuki Morioka Yu Zhang Nanxin Chen DiffM 36 27 0 02 Nov 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling Tiberiu Boros Stefan Daniel Dumitrescu Ionut Mironica Radu Chivereanu GAN 22 1 0 14 Oct 2023
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network Takashi Shibuya Yuhta Takida Yuki Mitsufuji 23 11 0 06 Sep 2023
On the Approximation of Bi-Lipschitz Maps by Invertible Neural Networks Bangti Jin Zehui Zhou Jun Zou 31 3 0 18 Aug 2023
An Overview on Generative AI at Scale with Edge-Cloud Computing Yun Cheng Wang Jintang Xue Chengwei Wei C.-C. Jay Kuo 24 30 0 02 Jun 2023
Deep Fake Detection, Deterrence and Response: Challenges and Opportunities Amin Azmoodeh Ali Dehghantanha 45 2 0 26 Nov 2022
STGlow: A Flow-based Generative Framework with Dual Graphormer for Pedestrian Trajectory Prediction Rongqin Liang Yuanman Li Jiantao Zhou Xia Li 39 12 0 21 Nov 2022
Audio Time-Scale Modification with Temporal Compressing Networks Ernie Chu Ju-Ting Chen Chia-Ping Chen 25 0 0 31 Oct 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System Yi-Chiao Wu Patrick Lumban Tobing Kazuki Yasuhara Noriyuki Matsunaga Yamato Ohtani T. Toda 42 0 0 13 Jul 2022
Avocodo: Generative Adversarial Network for Artifact-free Vocoder Taejun Bak Junmo Lee Hanbin Bae Jinhyeok Yang Jaesung Bae Young-Sun Joo 27 28 0 27 Jun 2022
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis Yi Wang Yi Si 28 0 0 20 Jun 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks Shanghua Gao Zhong-Yu Li Qi Han Ming-Ming Cheng Liang Wang 39 34 0 14 Jun 2022
BigVGAN: A Universal Neural Vocoder with Large-Scale Training Sang-gil Lee Ming-Yu Liu Boris Ginsburg Bryan Catanzaro Sung-Hoon Yoon 33 230 0 09 Jun 2022
Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data Sungwon Kim Heeseung Kim Sung-Hoon Yoon DiffM 204 52 0 30 May 2022
Parallel Synthesis for Autoregressive Speech Generation Po-Chun Hsu Da-Rong Liu Andy T. Liu Hung-yi Lee 42 5 0 25 Apr 2022
Universal approximation property of invertible neural networks Isao Ishikawa Takeshi Teshima Koichi Tojo Kenta Oono Masahiro Ikeda Masashi Sugiyama 49 30 0 15 Apr 2022
TO-FLOW: Efficient Continuous Normalizing Flows with Temporal Optimization adjoint with Moving Speed Shian Du Yihong Luo Wei Chen Jian Xu Delu Zeng 37 7 0 19 Mar 2022
Wavebender GAN: An architecture for phonetically meaningful speech manipulation Gustavo Teodoro Döhler Beck Ulme Wennberg Zofia Malisz G. Henter AI4CE 32 8 0 22 Feb 2022
It's Raw! Audio Generation with State-Space Models Karan Goel Albert Gu Chris Donahue Christopher Ré 25 187 0 20 Feb 2022
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis Yu Wang Xinsheng Wang Pengcheng Zhu Jie Wu Hanzhao Li Heyang Xue Yongmao Zhang Lei Xie Mengxiao Bi 25 97 0 19 Jan 2022
Audio representations for deep learning in sound synthesis: A review Anastasia Natsiou Seán O'Leary AI4TS 30 18 0 07 Jan 2022
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance Heeseung Kim Sungwon Kim Sungroh Yoon DiffM BDL 19 107 0 23 Nov 2021
Approaching the Limit of Image Rescaling via Flow Guidance Shangzhou Li Guixuan Zhang Zhengxiong Luo Jie Liu Zhi Zeng Shuwu Zhang 39 9 0 09 Nov 2021
WaveFake: A Data Set to Facilitate Audio Deepfake Detection Joel Frank Lea Schönherr DiffM 132 125 0 04 Nov 2021
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation Aditya Sanghi Hang Chu Joseph G. Lambourne Ye Wang Chin-Yi Cheng Marco Fumero Kamal Rahimi Malekshan CLIP 60 289 0 06 Oct 2021
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis Manh Luong Viet-Anh Tran 14 2 0 27 Sep 2021
Normalizing field flows: Solving forward and inverse stochastic differential equations using physics-informed flow models Ling Guo Hao Wu Tao Zhou AI4CE 14 45 0 30 Aug 2021
Integrated Speech and Gesture Synthesis Siyang Wang Simon Alexanderson Joakim Gustafson Jonas Beskow G. Henter Éva Székely 37 19 0 25 Aug 2021
Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling Christos Sakaridis Andreas Lugmayr Peng Sun Martin Danelljan Luc Van Gool Radu Timofte 48 102 0 11 Aug 2021
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing Zhaofeng Shi 26 7 0 01 Aug 2021
PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows Aihua Mao Zihui Du Junhui Hou Yaqi Duan Yong-jin Liu Ying He 3DPC 37 35 0 13 Jul 2021
Energy Consumption of Deep Generative Audio Models Constance Douwes P. Esling Jean-Pierre Briot MedIm 22 13 0 06 Jul 2021
A Survey on Neural Speech Synthesis Xu Tan Tao Qin Frank Soong Tie-Yan Liu AI4TS 18 353 0 29 Jun 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Nanxin Chen Yu Zhang Heiga Zen Ron J. Weiss Mohammad Norouzi Najim Dehak William Chan DiffM 23 88 0 17 Jun 2021
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows Adam Gabryś Yunlong Jiao V. Klimkov Daniel Korzekwa Roberto Barra-Chicote 15 1 0 16 Jun 2021
Catch-A-Waveform: Learning to Generate Audio from a Single Short Example Gal Greshler Tamar Rott Shaham T. Michaeli 18 25 0 11 Jun 2021
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis Ji-Hoon Kim Sang-Hoon Lee Ji-Hyun Lee Seong-Whan Lee 24 53 0 04 Jun 2021
Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics V. Jayaram John Thickstun DiffM 28 23 0 17 May 2021
ItôTTS and ItôWave: Linear Stochastic Differential Equation Is All You Need For Audio Generation Shoule Wu Ziqiang Shi DiffM 27 11 0 17 May 2021
Review of end-to-end speech synthesis technology based on deep learning Zhaoxi Mu Xinyu Yang Yizhuo Dong AuLLM ALM 26 24 0 20 Apr 2021
Unified Source-Filter GAN: Unified Source-filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN Reo Yoneyama Yi-Chiao Wu T. Toda 14 12 0 10 Apr 2021
Flow-based Kernel Prior with Application to Blind Super-Resolution Christos Sakaridis Peng Sun Shuhang Gu Luc Van Gool Radu Timofte SupR 22 127 0 29 Mar 2021
Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN Cong Wang Yu Chen Bin Wang Yi Shi 35 1 0 26 Mar 2021