
v1v2 (latest)
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities
Papers citing "SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities"
40 / 40 papers shown
Title |
---|
![]() MIO: A Foundation Model on Multimodal Tokens Zekun Wang King Zhu Chunpu Xu Wangchunshu Zhou Jiaheng Liu ...Yuanxing Zhang Ge Zhang Ke Xu Jie Fu Wenhao Huang |
![]() WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Shengpeng Ji Ziyue Jiang Xize Cheng Yifu Chen Minghui Fang ...Rongjie Huang Yidi Jiang Qian Chen Zhou Zhao Zhou Zhao |