Title |
---|
![]() MIO: A Foundation Model on Multimodal Tokens Zekun Wang King Zhu Chunpu Xu Wangchunshu Zhou Jiaheng Liu ...Yuanxing Zhang Ge Zhang Ke Xu Jie Fu Wenhao Huang |
![]() Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation Siyin Wang Wenyi Yu Yudong Yang Changli Tang Yixuan Li ...Jun Zhang Guangzhi Sun Lu Lu Yuxuan Wang Chao Zhang |
![]() OmniBench: Towards The Future of Universal Omni-Language Models Yizhi Li Ge Zhang Yinghao Ma Ruibin Yuan Kang Zhu ...Zhaoxiang Zhang Zachary Liu Emmanouil Benetos Wenhao Huang Chenghua Lin |
![]() Chain-of-Thought Prompting for Speech Translation Ke Hu Zhehuai Chen Chao-Han Huck Yang Piotr Żelasko Oleksii Hrinchuk Vitaly Lavrukhin Jagadeesh Balam Boris Ginsburg |
![]() WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Shengpeng Ji Ziyue Jiang Xize Cheng Yifu Chen Minghui Fang ...Rongjie Huang Yidi Jiang Qian Chen Zhou Zhao Zhou Zhao |