Papers
Communities
Organizations
Events
Blog
Pricing
Search
Open menu
Home
Papers
2409.00986
Cited By
v1
v2 (latest)
Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language
3 January 2025
Jeong Hun Yeo
Chae Won Kim
Hyunjun Kim
Hyeongseop Rha
Seunghee Han
Wen-Huang Cheng
Y. Ro
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language"
31 / 31 papers shown
Title
VALLR: Visual ASR Language Model for Lip Reading
Marshall Thomas
Edward Fish
Richard Bowden
97
0
0
27 Mar 2025
MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens
Jeong Hun Yeo
Hyeongseop Rha
Se Jin Park
Y. Ro
135
0
0
14 Mar 2025
Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization
Linzhi Wu
Xingyu Zhang
Yakun Zhang
Changyan Zheng
Tiejun Liu
Liang Xie
Ye Yan
Erwei Yin
69
1
0
24 Mar 2024
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Jeong Hun Yeo
Seunghee Han
Minsu Kim
Y. Ro
121
15
0
23 Feb 2024
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation
Minsu Kim
Jeong Hun Yeo
Se Jin Park
J. Choi
Y. Ro
108
6
0
18 Jan 2024
Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading
Songtao Luo
Shuang Yang
Shiguang Shan
Xilin Chen
123
2
0
08 Oct 2023
Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
Jeong Hun Yeo
Minsu Kim
Shinji Watanabe
Y. Ro
VLM
101
12
0
15 Sep 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim
Jeong Hun Yeo
J. Choi
Y. Ro
79
17
0
18 Aug 2023
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo
Minsu Kim
J. Choi
Dae Hoe Kim
Y. Ro
64
19
0
15 Aug 2023
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels
Pingchuan Ma
A. Haliassos
Adriana Fernandez-Lopez
Honglie Chen
Stavros Petridis
Maja Pantic
109
115
0
25 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.8K
13,560
0
27 Feb 2023
Robust Speech Recognition via Large-Scale Weak Supervision
Alec Radford
Jong Wook Kim
Tao Xu
Greg Brockman
C. McLeavey
Ilya Sutskever
OffRL
262
3,780
0
06 Dec 2022
Speaker-adaptive Lip Reading with User-dependent Padding
Minsu Kim
Hyunjun Kim
Y. Ro
84
21
0
09 Aug 2022
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
Minsu Kim
Jeong Hun Yeo
Yong Man Ro
95
64
0
04 Apr 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
143
321
0
05 Jan 2022
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
928
10,661
0
17 Jun 2021
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
Maja Pantic
172
234
0
12 Feb 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.3K
42,754
0
28 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
290
3,179
0
16 May 2020
Deformation Flow Based Two-Stream Network for Lip Reading
Jingyun Xiao
Shuang Yang
Yuanhang Zhang
Shiguang Shan
Xilin Chen
69
64
0
12 Mar 2020
Speaker Adaptive Training using Model Agnostic Meta-Learning
Ondˇrej Klejch
Joachim Fainberg
P. Bell
Steve Renals
101
30
0
23 Oct 2019
Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading
Xinshuo Weng
Kris Kitani
121
72
0
04 May 2019
LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild
Shuang Yang
Yuanhang Zhang
Dalu Feng
Mingmin Yang
Chenhao Wang
Jingyun Xiao
Keyu Long
Shiguang Shan
Xilin Chen
141
151
0
16 Oct 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
140
711
0
06 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
102
446
0
03 Sep 2018
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
390
2,290
0
14 Jun 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
1.2K
133,599
0
12 Jun 2017
Combining Residual Networks with LSTMs for Lipreading
Themos Stafylakis
Georgios Tzimiropoulos
VLM
156
310
0
12 Mar 2017
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
356
796
0
16 Nov 2016
LipNet: End-to-End Sentence-level Lipreading
Yannis Assael
Brendan Shillingford
Shimon Whiteson
Nando de Freitas
118
399
0
05 Nov 2016
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
627
20,640
0
10 Sep 2014
1