Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2003.03206
Cited By
Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition
6 March 2020
Yuanhang Zhang
Shuang Yang
Jingyun Xiao
Shiguang Shan
Xilin Chen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition"
27 / 27 papers shown
Title
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
54
0
0
07 May 2025
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models
Jing-Xuan Zhang
Genshun Wan
Jianqing Gao
Zhen-Hua Ling
49
0
0
09 Feb 2025
RAL:Redundancy-Aware Lipreading Model Based on Differential Learning with Symmetric Views
Zejun gu
Junxia jiang
36
0
0
09 Sep 2024
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
He Wang
Pengcheng Guo
Xucheng Wan
Huan Zhou
Lei Xie
43
2
0
08 Apr 2024
Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization
Linzhi Wu
Xingyu Zhang
Yakun Zhang
Changyan Zheng
Tiejun Liu
Liang Xie
Ye Yan
Erwei Yin
35
1
0
24 Mar 2024
Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish
David Gimeno-Gómez
Carlos David Martínez Hinarejos
31
0
0
21 Nov 2023
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
David Gimeno-Gómez
Carlos David Martínez Hinarejos
19
8
0
21 Nov 2023
Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading
Songtao Luo
Shuang Yang
Shiguang Shan
Xilin Chen
38
1
0
08 Oct 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
34
16
0
18 Aug 2023
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
27
5
0
17 Aug 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
26
18
0
15 Aug 2023
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition
Jeong Hun Yeo
Minsu Kim
Y. Ro
27
11
0
08 May 2023
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications
Muhammad Arslan Manzoor
S. Albarri
Ziting Xian
Zaiqiao Meng
Preslav Nakov
Shangsong Liang
AI4TS
42
26
0
01 Feb 2023
Training Strategies for Improved Lip-reading
Pingchuan Ma
Yujiang Wang
Stavros Petridis
Jie Shen
M. Pantic
28
46
0
03 Sep 2022
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
Hesham M. Eraqi
VLM
33
2
0
05 Jun 2022
Is Lip Region-of-Interest Sufficient for Lipreading?
Jing-Xuan Zhang
Genshun Wan
Jia-Yu Pan
24
6
0
28 May 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
31
51
0
04 Apr 2022
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
Minsu Kim
Jeong Hun Yeo
Yong Man Ro
13
61
0
04 Apr 2022
Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video
Minsu Kim
Joanna Hong
Se Jin Park
Yong Man Ro
CVBM
25
40
0
04 Apr 2022
Advances and Challenges in Deep Lip Reading
Marzieh Oghbaie
Arian Sabaghi
Kooshan Hashemifard
Mohammad Akbari
VLM
30
15
0
15 Oct 2021
Sub-word Level Lip Reading With Visual Attention
Prajwal K R
Triantafyllos Afouras
Andrew Zisserman
17
92
0
14 Oct 2021
Spatio-Temporal Attention Mechanism and Knowledge Distillation for Lip Reading
Shahd Elashmawy
Marian M. Ramsis
Hesham M. Eraqi
Farah Eldeshnawy
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
35
1
0
07 Aug 2021
Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
Peisong Wen
Qianqian Xu
Yangbangyan Jiang
Zhiyong Yang
Yuan He
Qingming Huang
CVBM
17
32
0
12 Mar 2021
Learn an Effective Lip Reading Model without Pains
Dalu Feng
Shuang Yang
Shiguang Shan
Xilin Chen
30
61
0
15 Nov 2020
Lip-reading with Densely Connected Temporal Convolutional Networks
Pingchuan Ma
Yujiang Wang
Jie Shen
Stavros Petridis
M. Pantic
16
56
0
29 Sep 2020
Towards Practical Lipreading with Distilled and Efficient Models
Pingchuan Ma
Brais Martínez
Stavros Petridis
M. Pantic
26
95
0
13 Jul 2020
Synchronous Bidirectional Learning for Multilingual Lip Reading
Mingshuang Luo
Shuang Yang
Xilin Chen
Zitao Liu
Shiguang Shan
28
15
0
08 May 2020
1