Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units

19 December 2022

Gallil Maimon

Yossi Adi

Papers citing "Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units"

Title | Authors | Tags | Counts | Date
On The Landscape of Spoken Language Models: A Comprehensive Survey | Siddhant Arora, Kai-Wei Chang, Chung-Ming Chien, Yifan Peng, Haibin Wu, Yossi Adi, Emmanuel Dupoux, Hung-yi Lee, Karen Livescu, Shinji Watanabe | | 52 2 0 | 11 Apr 2025
Scaling Analysis of Interleaved Speech-Text Language Models | Gallil Maimon, Michael Hassid, Amit Roth, Yossi Adi | AuLLM | 43 0 0 | 03 Apr 2025
Slamming: Training a Speech Language Model on One GPU in a Day | Gallil Maimon, Avishai Elmakies, Yossi Adi | | 38 3 0 | 19 Feb 2025
Emotion Recognition and Generation: A Comprehensive Review of Face, Speech, and Text Modalities | Rebecca Mobbs, Dimitrios Makris, Vasileios Argyriou | | 43 0 0 | 02 Feb 2025
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion | Ashishkumar Gudmalwar, Ishan D. Biyani, Nirmesh J. Shah, Pankaj Wasnik, R. Shah | DiffM | 26 0 0 | 31 Dec 2024
A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings | Haopeng Geng, Daisuke Saito, N. Minematsu | | 18 1 0 | 03 Oct 2024
Simulating Native Speaker Shadowing for Nonnative Speech Assessment with Latent Speech Representations | Haopeng Geng, Daisuke Saito, Nobuaki Minematsu | | 25 0 0 | 18 Sep 2024
Salmon: A Suite for Acoustic Language Model Evaluation | Gallil Maimon, Amit Roth, Yossi Adi | ELM, AuLLM | 51 5 0 | 11 Sep 2024
NAST: Noise Aware Speech Tokenization for Speech Language Models | Shoval Messica, Yossi Adi | | 30 6 0 | 16 Jun 2024
The VoicePrivacy 2024 Challenge Evaluation Plan | N. Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas W. D. Evans, Junichi Yamagishi, Massimiliano Todisco | | 36 21 0 | 03 Apr 2024
Scaling Speech Technology to 1,000+ Languages | Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, ..., Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli | VLM | 77 300 0 | 22 May 2023
AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation | Guy Yariv, Itai Gat, Lior Wolf, Yossi Adi, Idan Schwartz | DiffM | 20 20 0 | 22 May 2023
Back Translation for Speech-to-text Translation Without Transcripts | Qingkai Fang, Yang Feng | | 30 13 0 | 15 May 2023
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model | Rui Xue, Yanqing Liu, Lei He, Xuejiao Tan, Linquan Liu, Ed Lin, Sheng Zhao | | 28 7 0 | 06 Mar 2023
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit | Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, J. Pino | VLM | 69 32 0 | 14 Sep 2021
Generative Spoken Language Modeling from Raw Audio | Kushal Lakhotia, Evgeny Kharitonov, Wei-Ning Hsu, Yossi Adi, Adam Polyak, ..., Tu Nguyen, Jade Copet, Alexei Baevski, A. Mohamed, Emmanuel Dupoux | AuLLM | 185 337 0 | 01 Feb 2021
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations | Wen-Chin Huang, Yi-Chiao Wu, Tomoki Hayashi, T. Toda | BDL | 39 37 0 | 23 Oct 2020