Title |
---|
MIO: A Foundation Model on Multimodal Tokens Zekun Wang King Zhu Chunpu Xu Wangchunshu Zhou Jiaheng Liu ... Yuanxing Zhang Ge Zhang Ke Xu Jie Fu Wenhao Huang |
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation Siyin Wang Wenyi Yu Yudong Yang Changli Tang Yixuan Li ... Jun Zhang Guangzhi Sun Lu Lu Yuxuan Wang Chao Zhang |
Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition Ye Bai Jingping Chen Jitong Chen Wei Chen Zhuo Chen ... Wanyi Zhang Yang Zhang Yawei Zhang Yijie Zheng Ming Zou |
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units Xuankai Chang Jiatong Shi Jinchuan Tian Yuning Wu Yuxun Tang Yihan Wu Shinji Watanabe Yossi Adi Xie Chen Qin Jin |
Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection Jiachen Lian Carly Feng Naasir Farooqi Steve Li Anshul Kashyap Cheol Jun Cho Peter Wu Robin Netzorg Tingle Li Gopala Krishna Anumanchipalli |