v1v2 (latest)

Efficient Neural Audio Synthesis

23 February 2018

Papers citing "Efficient Neural Audio Synthesis"

50 / 469 papers shown

Title
E3 TTS: Easy End-to-End Diffusion-based Text to Speech Yuan Gao Nobuyuki Morioka Yu Zhang Nanxin Chen DiffM 87 33 0 02 Nov 2023
An overview of text-to-speech systems and media applications Mohammad Reza Hasanabadi 28 3 0 22 Oct 2023
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling Tiberiu Boros Stefan Daniel Dumitrescu Ionut Mironica Radu Chivereanu GAN 38 1 0 14 Oct 2023
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models Alexandre R. Ferreira Cláudio E. C. Campelo 34 1 0 22 Sep 2023
SpeechAlign: a Framework for Speech Translation Alignment Evaluation Belen Alastruey Aleix Sant Gerard I. Gállego David Dale Marta R. Costa-jussá AuLLM 59 3 0 20 Sep 2023
Speech Synthesis By Unrolling Diffusion Process using Neural Network Layers Peter Ochieng DiffM 56 0 0 18 Sep 2023
A Two-Stage Training Framework for Joint Speech Compression and Enhancement Jiayi Huang Zeyu Yan Wenbin Jiang Fei Wen 61 1 0 08 Sep 2023
BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network Takashi Shibuya Yuhta Takida Yuki Mitsufuji 71 11 0 06 Sep 2023
Voice Morphing: Two Identities in One Voice Sushant Pani Anurag Chowdhury Morgan Sandler Arun Ross 79 1 0 05 Sep 2023
A Review of Differentiable Digital Signal Processing for Music & Speech Synthesis B. Hayes Jordie Shier Gyorgy Fazekas Andrew Mcpherson C. Saitis 83 25 0 29 Aug 2023
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Shogo Seki 82 5 0 14 Aug 2023
Neural Networks at a Fraction with Pruned Quaternions Sahel Mohammad Iqbal Subhankar Mishra 80 4 0 13 Aug 2023
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design Jungil Kong Jihoon Park Beomjeong Kim Jeongmin Kim Dohee Kong Sangjin Kim 59 41 0 31 Jul 2023
Meta-Transformer: A Unified Framework for Multimodal Learning Yiyuan Zhang Kaixiong Gong Kaipeng Zhang Hongsheng Li Yu Qiao Wanli Ouyang Xiangyu Yue 105 150 0 20 Jul 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading Yujia Xiao Shaofei Zhang Xi Wang Xuejiao Tan Lei He Sheng Zhao Frank Soong Tan Lee 46 6 0 03 Jul 2023
MFCCGAN: A Novel MFCC-Based Speech Synthesizer Using Adversarial Learning Mohammad Reza Hasanabadi 26 3 0 22 Jun 2023
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading Yochai Yemini Aviv Shamsian Lior Bracha Sharon Gannot Ethan Fetaya DiffM 116 15 0 05 Jun 2023
Towards Robust FastSpeech 2 by Modelling Residual Multimodality Fabian Kögel Bac Nguyen Fabien Cardinaux 55 2 0 02 Jun 2023
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis Hubert Siuzdak 148 104 0 01 Jun 2023
Towards Selection of Text-to-speech Data to Augment ASR Training Shuo Liu Leda Sari Chunyang Wu Gil Keren Yuan Shangguan Jay Mahadeokar Ozlem Kalinli 38 5 0 30 May 2023
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus Yuma Koizumi Heiga Zen Shigeki Karita Yifan Ding Kohei Yatabe Nobuyuki Morioka M. Bacchiani Yu Zhang Wei Han Ankur Bapna 117 80 0 30 May 2023
Translatotron 3: Speech to Speech Translation with Monolingual Data Eliya Nachmani Alon Levkovitch Yi-Yang Ding Chulayutsh Asawaroengchai Heiga Zen Michelle Tadmor Ramanovich 91 15 0 27 May 2023
Efficient Neural Music Generation Max W. Y. Lam Qiao Tian Tang-Chun Li Zongyu Yin Siyuan Feng ... Mingbo Ma Xuchen Song Jitong Chen Yuping Wang Yuxuan Wang DiffM MGen 95 56 0 25 May 2023
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis based on Frequency Modulation Zhe Ye Wei Xue Xuejiao Tan Qi-fei Liu Yi-Ting Guo 79 2 0 22 May 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra Yang Ai Zhenhua Ling 101 14 0 13 May 2023
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers L. Yu Daniel Simig Colin Flaherty Armen Aghajanyan Luke Zettlemoyer M. Lewis 116 93 0 12 May 2023
JaxPruner: A concise library for sparsity research Jooyoung Lee Wonpyo Park Nicole Mitchell Jonathan Pilault J. Obando-Ceron ... Hong-Seok Kim Yann N. Dauphin Karolina Dziugaite Pablo Samuel Castro Utku Evci 120 16 0 27 Apr 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis Ye-Xin Lu Yang Ai Zhenhua Ling 105 1 0 26 Apr 2023
AI-Synthesized Voice Detection Using Neural Vocoder Artifacts Chengzhe Sun Shan Jia Shuwei Hou Siwei Lyu 72 45 0 25 Apr 2023
SAR: Self-Supervised Anti-Distortion Representation for End-To-End Speech Model Jianzong Wang Xulong Zhang Haobin Tang Aolan Sun Ning Cheng Jing Xiao 129 1 0 23 Apr 2023
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models Yuancheng Wang Zeqian Ju Xuejiao Tan Lei He Zhizheng Wu Jiang Bian Sheng Zhao DiffM 154 55 0 03 Apr 2023
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Shogo Seki 52 9 0 24 Mar 2023
Controllable Prosody Generation With Partial Inputs Dan-Andrei Iliescu D. Mohan Tian Huey Teh Zack Hodari 48 1 0 14 Mar 2023
TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets Weixin Chen Basel Alomair Yue Liu DiffM 108 81 0 10 Mar 2023
Scaling strategies for on-device low-complexity source separation with Conv-Tasnet Mohamed Nabih Ali Francesco Paissan Daniele Falavigna Alessio Brutti 53 2 0 06 Mar 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model Rui Xue Yanqing Liu Lei He Xuejiao Tan Linquan Liu Ed Lin Sheng Zhao 118 7 0 06 Mar 2023
A General Framework for Learning Procedural Audio Models of Environmental Sounds Danzel Serrano M. Cartwright DiffM DRL 63 1 0 04 Mar 2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations Yuma Koizumi Heiga Zen Shigeki Karita Yifan Ding Kohei Yatabe Nobuyuki Morioka Yu Zhang Wei Han Ankur Bapna M. Bacchiani 94 29 0 03 Mar 2023
Hello Me, Meet the Real Me: Audio Deepfake Attacks on Voice Assistants Domna Bilika Nikoletta Michopoulou E. Alepis Constantinos Patsakis 69 10 0 20 Feb 2023
Exposing AI-Synthesized Human Voices Using Neural Vocoder Artifacts Chengzhe Sun Shan Jia Shuwei Hou Ehab AlBadawy Siwei Lyu 163 3 0 18 Feb 2023
Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation Shahar Lutati Eliya Nachmani Lior Wolf DiffM 77 16 0 25 Jan 2023
Generative Emotional AI for Speech Emotion Recognition: The Case for Synthetic Emotional Speech Augmentation Abdullah Shahid S. Latif Junaid Qadir 64 23 0 10 Jan 2023
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech Ze Chen Yihan Wu Yichong Leng Jiawei Chen Haohe Liu ... Ke Wang Lei He Sheng Zhao Jiang Bian Danilo Mandic DiffM 108 23 0 30 Dec 2022
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset Kailin Liang Bin Liu Yifan Hu Rui Liu F. Bao Guanglai Gao 76 1 0 11 Dec 2022
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity Ahmed Mustafa J. Valin Jan Büthe Paris Smaragdis Mike Goodwin 54 4 0 08 Dec 2022
Evince the artifacts of Spoof Speech by blending Vocal Tract and Voice Source Features T. U. K. Reddy Sahukari Chaitanya Varun Kota Pranav Kumar Sankala Sreekanth K. Murty 117 0 0 05 Dec 2022
Are Straight-Through gradients and Soft-Thresholding all you need for Sparse Training? A. Vanderschueren Christophe De Vleeschouwer MQ 56 9 0 02 Dec 2022
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts Trevor Gale Deepak Narayanan C. Young Matei A. Zaharia MoE 81 109 0 29 Nov 2022
Efficient Incremental Text-to-Speech on GPUs Muyang Du Chuan Liu Jiaxing Qi Junjie Lai 52 1 0 25 Nov 2022
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing J. Webber Cassia Valentini-Botinhao Evelyn Williams G. Henter Simon King 111 9 0 13 Nov 2022