iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform

4 March 2022

Takuhiro Kaneko

Hirokazu Kameoka

Papers citing "iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform"

Designing Neural Synthesizers for Low-Latency Interaction. Franco Caspe, Jordie Shier, Mark Sandler, C. Saitis, Andrew Mcpherson. 14 Mar 2025.
Less is More for Synthetic Speech Detection in the Wild. Ashi Garg, Zexin Cai, Henry Li Xinyuan, Leibny Paola García-Perera, Kevin Duh, Sanjeev Khudanpur, Matthew Wiesner, Nicholas Andrews. 17 Feb 2025.
FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter. Yuanjun Lv, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie. 12 Jun 2024.
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias. Sipan Li, Songxiang Liu, Lu Zhang, Xiang Li, Yanyao Bian, Chao Weng, Zhiyong Wu, Helen Meng. 14 Sep 2023.
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs. Won Jang, D. Lim, Heayoung Park. 18 May 2023.
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra. Yang Ai, Zhenhua Ling. 13 May 2023.
Msanii: High Fidelity Music Synthesis on a Shoestring Budget. Kinyugo Maina. 16 Jan 2023.
Puffin: pitch-synchronous neural waveform generation for fullband speech on modest devices. O. Watts, Lovisa Wihlborg, Cassia Valentini-Botinhao. 25 Nov 2022.
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing. J. Webber, Cassia Valentini-Botinhao, Evelyn Williams, G. Henter, Simon King. 13 Nov 2022.
WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration. Yuma Koizumi, Kohei Yatabe, Heiga Zen, M. Bacchiani. 03 Oct 2022.
A Deep Reinforcement Learning Blind AI in DareFightingICE. Thai Van Nguyen, Xincheng Dai, Ibrahim Khan, R. Thawonmas, H. V. Pham. 16 May 2022.
MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames. Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo. 25 Feb 2021.
High Fidelity Speech Synthesis with Adversarial Networks. Mikolaj Binkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan. 25 Sep 2019.