Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition

6 March 2020
Yuanhang Zhang
Shuang Yang
Jingyun Xiao
Shiguang Shan
Xilin Chen

Papers citing "Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition"

27 / 27 papers shown
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer
Young-Hu Park
R.-H. Park
Hyung-Min Park
54
0
0
07 May 2025
Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models
Jing-Xuan Zhang
Genshun Wan
Jianqing Gao
Zhen-Hua Ling
49
0
0
09 Feb 2025
RAL:Redundancy-Aware Lipreading Model Based on Differential Learning with Symmetric Views
Zejun Gu
Junxia Jiang
36
0
0
09 Sep 2024
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder
He Wang
Pengcheng Guo
Xucheng Wan
Huan Zhou
Lei Xie
43
2
0
08 Apr 2024
Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization
Linzhi Wu
Xingyu Zhang
Yakun Zhang
Changyan Zheng
Tiejun Liu
Liang Xie
Ye Yan
Erwei Yin
35
1
0
24 Mar 2024
Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish
David Gimeno-Gómez
Carlos David Martínez Hinarejos
31
0
0
21 Nov 2023
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild
David Gimeno-Gómez
Carlos David Martínez Hinarejos
19
8
0
21 Nov 2023
Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading
Songtao Luo
Shuang Yang
Shiguang Shan
Xilin Chen
38
1
0
08 Oct 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
34
16
0
18 Aug 2023
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Li Liu
Lufei Gao
Wen-Ling Lei
Fengji Ma
Xiaotian Lin
Jin-Tao Wang
CVBM
27
5
0
17 Aug 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
26
18
0
15 Aug 2023
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition
Jeong Hun Yeo
Minsu Kim
Y. Ro
27
11
0
08 May 2023
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications
Muhammad Arslan Manzoor
S. Albarri
Ziting Xian
Zaiqiao Meng
Preslav Nakov
Shangsong Liang
AI4TS
42
26
0
01 Feb 2023
Training Strategies for Improved Lip-reading
Pingchuan Ma
Yujiang Wang
Stavros Petridis
Jie Shen
M. Pantic
28
46
0
03 Sep 2022
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
Hesham M. Eraqi
VLM
33
2
0
05 Jun 2022
Is Lip Region-of-Interest Sufficient for Lipreading?
Jing-Xuan Zhang
Genshun Wan
Jia-Yu Pan
24
6
0
28 May 2022
Lip to Speech Synthesis with Visual Context Attentional GAN
Minsu Kim
Joanna Hong
Y. Ro
31
51
0
04 Apr 2022
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
Minsu Kim
Jeong Hun Yeo
Yong Man Ro
13
61
0
04 Apr 2022
Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video
Minsu Kim
Joanna Hong
Se Jin Park
Yong Man Ro
CVBM
25
40
0
04 Apr 2022
Advances and Challenges in Deep Lip Reading
Marzieh Oghbaie
Arian Sabaghi
Kooshan Hashemifard
Mohammad Akbari
VLM
30
15
0
15 Oct 2021
Sub-word Level Lip Reading With Visual Attention
Prajwal K R
Triantafyllos Afouras
Andrew Zisserman
17
92
0
14 Oct 2021
Spatio-Temporal Attention Mechanism and Knowledge Distillation for Lip Reading
Shahd Elashmawy
Marian M. Ramsis
Hesham M. Eraqi
Farah Eldeshnawy
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
35
1
0
07 Aug 2021
Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association
Peisong Wen
Qianqian Xu
Yangbangyan Jiang
Zhiyong Yang
Yuan He
Qingming Huang
CVBM
17
32
0
12 Mar 2021
Learn an Effective Lip Reading Model without Pains
Dalu Feng
Shuang Yang
Shiguang Shan
Xilin Chen
30
61
0
15 Nov 2020
Lip-reading with Densely Connected Temporal Convolutional Networks
Pingchuan Ma
Yujiang Wang
Jie Shen
Stavros Petridis
M. Pantic
16
56
0
29 Sep 2020
Towards Practical Lipreading with Distilled and Efficient Models
Pingchuan Ma
Brais Martínez
Stavros Petridis
M. Pantic
26
95
0
13 Jul 2020
Synchronous Bidirectional Learning for Multilingual Lip Reading
Mingshuang Luo
Shuang Yang
Xilin Chen
Zitao Liu
Shiguang Shan
28
15
0
08 May 2020