2412.04917
Cited By
Continuous Speech Tokens Makes LLMs Robust Multi-Modality Learners
6 December 2024
Ze Yuan
Yanqing Liu
Shujie Liu
Sheng Zhao
AuLLM
Papers citing
"Continuous Speech Tokens Makes LLMs Robust Multi-Modality Learners"
7 / 7 papers shown
Title
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling
Kuan Po Huang
Shu-Wen Yang
Huy Phan
Bo-Ru Lu
Byeonggeun Kim
...
Qingming Tang
Shalini Ghosh
Hung-yi Lee
Chieh-Chi Kao
Chao Wang
36
0
0
31 May 2025
On The Landscape of Spoken Language Models: A Comprehensive Survey
Siddhant Arora
Kai-Wei Chang
Chung-Ming Chien
Yifan Peng
Haibin Wu
Yossi Adi
Emmanuel Dupoux
Hung-yi Lee
Karen Livescu
Shinji Watanabe
161
14
0
11 Apr 2025
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Qinglin Zhang
Luyao Cheng
Chong Deng
Qian Chen
Wen Wang
...
Jiaqing Liu
Hai Yu
Chaohong Tan
Zhihao Du
Shiliang Zhang
SyDa
BDL
AuLLM
VLM
146
20
0
23 Oct 2024
Ichigo: Mixed-Modal Early-Fusion Realtime Voice Assistant
Alan Dao
Dinh Bach Vu
Huy Hoang Ha
AuLLM
VLM
143
5
0
20 Oct 2024
Recent Advances in Speech Language Models: A Survey
Wenqian Cui
Dianzhi Yu
Xiaoqi Jiao
Ziqiao Meng
Guangyan Zhang
Qichao Wang
Yiwen Guo
Irwin King
AuLLM
211
26
0
01 Oct 2024
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
Kai Chen
Yunhao Gou
Runhui Huang
Zhili Liu
Daxin Tan
...
Qun Liu
Jun Yao
Lu Hou
Hang Xu
AuLLM
MLLM
VLM
192
29
0
26 Sep 2024
Autoregressive Speech Synthesis without Vector Quantization
Lingwei Meng
Long Zhou
Shujie Liu
Sanyuan Chen
Bing Han
...
Jinyu Li
Sheng Zhao
Xixin Wu
Helen M. Meng
Furu Wei
174
43
0
11 Jul 2024