Title |
---|
Large Language Models are Strong Audio-Visual Speech Recognition Learners. Umberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma, Stavros Petridis, Daniele Falavigna, Alessio Brutti, Maja Pantic |
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition. Guinan Li, Jiajun Deng, Youjun Chen, Mengzhe Geng, Shujie Hu, ..., Zengrui Jin, Tianzi Wang, Xurong Xie, Helen Meng, Xunying Liu |
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation. Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Lin Li, ..., Jinzheng He, Lichao Zhang, Jinglin Liu, Xiaoyue Yin, Zhou Zhao |