2307.10157
Cited By
Leveraging Visemes for Better Visual Speech Representation and Lip Reading
19 July 2023
J. Peymanfard
Vahid Saeedi
Mohammad Reza Mohammadi
Hossein Zeinali
N. Mozayani
Papers citing "Leveraging Visemes for Better Visual Speech Representation and Lip Reading"
13 / 13 papers shown
Title
Word-level Persian Lipreading Dataset
J. Peymanfard
Ali Lashini
Samin Heydarian
Hossein Zeinali
N. Mozayani
50
5
0
08 Apr 2023
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
91
316
0
05 Jan 2022
Lip reading using external viseme decoding
J. Peymanfard
Mohammad Reza Mohammadi
Hossein Zeinali
N. Mozayani
31
11
0
10 Apr 2021
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
Maja Pantic
113
231
0
12 Feb 2021
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
220
3,131
0
16 May 2020
Audio-visual Recognition of Overlapped speech for the LRS2 dataset
Jianwei Yu
Shi-Xiong Zhang
Jian Wu
Shahram Ghorbani
Bo Wu
Shiyin Kang
Shansong Liu
Xunying Liu
Helen Meng
Dong Yu
71
73
0
06 Jan 2020
Common Voice: A Massively-Multilingual Speech Corpus
Rosana Ardila
Megan Branson
Kelly Davis
Michael Henretty
M. Kohler
Josh Meyer
Reuben Morais
Lindsay Saunders
Francis M. Tyers
Gregor Weber
VLM
91
1,594
0
13 Dec 2019
ASR is all you need: cross-modal distillation for lip reading
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
40
135
0
28 Nov 2019
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Stavros Petridis
Themos Stafylakis
Pingchuan Ma
Georgios Tzimiropoulos
Maja Pantic
57
131
0
28 Sep 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
89
703
0
06 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
62
441
0
03 Sep 2018
Decoding visemes: improving machine lipreading
Helen L. Bear
R. Harvey
VLM
85
42
0
03 Oct 2017
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
250
789
0
16 Nov 2016