Exploring the development and application of large language models specifically tailored for audio data processing and understanding.
| Title | Authors |
|---|---|
| DisCo-Speech: Controllable Zero-Shot Speech Generation with A Disentangled Speech Codec | Tao Li Wengshuo Ge Zhichao Wang Zihao Cui Yong Ma Yingying Gao Chao Deng Shilei Zhang Junlan Feng |
| JointAVBench: A Benchmark for Joint Audio-Visual Reasoning Evaluation | Jianghan Chao Jianzhang Gao Wenhui Tan Yuchong Sun Ruihua Song Liyun Ru |
| Spoken Conversational Agents with Large Language Models | Chao-Han Huck Yang Andreas Stolcke Larry Heck |
| Cross-Lingual Interleaving for Speech Language Models | Adel Moumen Guangzhi Sun Philip C. Woodland |
| See, Hear, and Understand: Benchmarking Audiovisual Human Speech Understanding in Multimodal Large Language Models | Le Thien Phuc Nguyen Zhuoran Yu Samuel Low Yu Hang Subin An Jeongik Lee ...SeungEun Chung Thanh-Huy Nguyen JuWan Maeng Soochahn Lee Yong Jae Lee |
| MAC-SLU: Multi-Intent Automotive Cabin Spoken Language Understanding Benchmark | Yuezhang Peng Chonghao Cai Ziang Liu Shuai Fan Sheng Jiang ...Kele Xu Yao Li Sheng Wang Libo Qin Xie Chen |
| ORCA: Open-ended Response Correctness Assessment for Audio Question Answering | Šimon Sedláček Sara Barahona Bolaji Yusuf Laura Herrera-Alarcón Santosh Kesiraju ...Sathvik Udupa Fernando López Allison Ferner Ramani Duraiswami Jan Černocký |