2504.15066
Cited By
Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides
21 April 2025
Jinghua Zhao, Yuhang Jia, Shiyao Wang, Jiaming Zhou, Hui Wang, Yong Qin
Papers citing "Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides"
2 / 2 papers shown
Title
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang, Zhe Chen, Wenhai Wang, Yue Cao, Yangzhou Liu, ..., Jinguo Zhu, X. Zhu, Lewei Lu, Yu Qiao, Jifeng Dai
LRM
145
93
1
15 Nov 2024
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan, V. Trinh, Vivek Voleti, Jacob Whitehill
111
1
0
13 Sep 2024