arXiv: 2401.09759
SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition
18 January 2024
Hao Wang, Shuhei Kurita, Shuichiro Shimizu, Daisuke Kawahara
Papers citing "SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition"
4 / 4 papers shown
Title
Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides
Jinghua Zhao, Yuhang Jia, Shiyao Wang, Jiaming Zhou, Hui Wang, Yong Qin
39
0
0
21 Apr 2025
Multi-modal Speech Transformer Decoders: When Do Multiple Modalities Improve Accuracy?
Yiwen Guan, V. Trinh, Vivek Voleti, Jacob Whitehill
42
1
0
13 Sep 2024
MaLa-ASR: Multimedia-Assisted LLM-Based ASR
Guanrou Yang, Ziyang Ma, Fan Yu, Zhifu Gao, Shiliang Zhang, Xie Chen
AuLLM
44
3
0
09 Jun 2024
Lip Reading Sentences in the Wild
Joon Son Chung, A. Senior, Oriol Vinyals, Andrew Zisserman
185
784
0
16 Nov 2016