Text-Free Prosody-Aware Generative Spoken Language Modeling

7 September 2021
Eugene Kharitonov
Ann Lee
Adam Polyak
Yossi Adi
Jade Copet
Kushal Lakhotia
Tu Nguyen
M. Rivière
Abdel-rahman Mohamed
Emmanuel Dupoux
Wei-Ning Hsu

Papers citing "Text-Free Prosody-Aware Generative Spoken Language Modeling"

32 / 32 papers shown
DOSE : Drum One-Shot Extraction from Music Mixture
Suntae Hwang
Seonghyeon Kang
Kyungsu Kim
Semin Ahn
K. Lee
36
0
0
25 Apr 2025
Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
Xue Jiang
Xiulian Peng
Yuan Zhang
Yan-Heng Lu
SSL
83
0
0
15 Mar 2025
Sylber: Syllabic Embedding Representation of Speech from Raw Audio
Cheol Jun Cho
Nicholas Lee
Akshat Gupta
Dhruv Agarwal
Ethan Chen
Alan W Black
Gopala K. Anumanchipalli
32
0
0
09 Oct 2024
Audio-Agent: Leveraging LLMs For Audio Generation, Editing and Composition
Zixuan Wang
Chi-Keung Tang
DiffM
VGen
LLMAG
43
4
0
04 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
59
14
0
01 Oct 2024
Salmon: A Suite for Acoustic Language Model Evaluation
Gallil Maimon
Amit Roth
Yossi Adi
ELM
AuLLM
49
5
0
11 Sep 2024
Language Model Can Listen While Speaking
Ziyang Ma
Yakun Song
Chenpeng Du
Jian Cong
Zhuo Chen
Yuping Wang
Y. Wang
Xie Chen
AuLLM
34
23
0
05 Aug 2024
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling
Wataru Nakata
Kentaro Seki
Hitomi Yanaka
Yuki Saito
Shinnosuke Takamichi
Hiroshi Saruwatari
AuLLM
43
0
0
22 Jul 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang
Jiatong Shi
Jinchuan Tian
Yuning Wu
Yuxun Tang
Yihan Wu
Shinji Watanabe
Yossi Adi
Xie Chen
Qin Jin
43
15
0
11 Jun 2024
MAD Speech: Measures of Acoustic Diversity of Speech
Matthieu Futeral
A. Agostinelli
Marco Tagliasacchi
Neil Zeghidour
Eugene Kharitonov
51
1
0
16 Apr 2024
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations
Guan-Ting Lin
Cheng-Han Chiang
Hung-yi Lee
34
22
0
20 Feb 2024
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
Cheol Jun Cho
Abdelrahman Mohamed
Shang-Wen Li
Alan W. Black
Gopala K. Anumanchipalli
29
8
0
16 Oct 2023
Toward Joint Language Modeling for Speech Units and Text
Ju-Chieh Chou
Chung-Ming Chien
Wei-Ning Hsu
Karen Livescu
Arun Babu
Alexis Conneau
Alexei Baevski
Michael Auli
VLM
26
20
0
12 Oct 2023
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Guanyong Wu
Guan-Ting Lin
Shang-Wen Li
Hung-yi Lee
21
5
0
29 May 2023
Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring
Kaiqi Fu
Shaojun Gao
Shuju Shi
Xiaohai Tian
Wei Li
Zejun Ma
18
2
0
19 May 2023
DUB: Discrete Unit Back-translation for Speech Translation
Dong Zhang
Rong Ye
Tom Ko
Mingxuan Wang
Yaqian Zhou
13
23
0
19 May 2023
Back Translation for Speech-to-text Translation Without Transcripts
Qingkai Fang
Yang Feng
30
13
0
15 May 2023
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units
Gallil Maimon
Yossi Adi
21
13
0
19 Dec 2022
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis
Shinhyeok Oh
HyeongRae Noh
Yoonseok Hong
Insoo Oh
18
0
0
15 Dec 2022
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
18
6
0
12 Nov 2022
Audio Language Modeling using Perceptually-Guided Discrete Representations
Felix Kreuk
Yaniv Taigman
Adam Polyak
Jade Copet
Gabriel Synnaeve
Alexandre Défossez
Yossi Adi
27
4
0
02 Nov 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
29
30
0
27 Oct 2022
Bootstrapping meaning through listening: Unsupervised learning of spoken sentence embeddings
Jian Zhu
Zuoyu Tian
Yadong Liu
Cong Zhang
Chia-wen Lo
SSL
30
2
0
23 Oct 2022
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models
Matthew Baas
Herman Kamper
DiffM
32
8
0
11 Oct 2022
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era
Andreas Triantafyllopoulos
Björn W. Schuller
Gökçe İymen
M. Sezgin
Xiangheng He
...
Shuo Liu
Silvan Mertes
Elisabeth André
Ruibo Fu
Jianhua Tao
15
53
0
06 Oct 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
25
289
0
30 Sep 2022
TVLT: Textless Vision-Language Transformer
Zineng Tang
Jaemin Cho
Yixin Nie
Mohit Bansal
VLM
51
28
0
28 Sep 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
41
567
0
07 Sep 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation
Sravya Popuri
Peng-Jen Chen
Changhan Wang
J. Pino
Yossi Adi
Jiatao Gu
Wei-Ning Hsu
Ann Lee
20
56
0
06 Apr 2022
Textless Speech-to-Speech Translation on Real Data
Ann Lee
Hongyu Gong
Paul-Ambroise Duquenne
Holger Schwenk
Peng-Jen Chen
...
Sravya Popuri
Yossi Adi
J. Pino
Jiatao Gu
Wei-Ning Hsu
26
142
0
15 Dec 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
182
336
0
01 Feb 2021