1611.05358
Lip Reading Sentences in the Wild
16 November 2016
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
Papers citing
"Lip Reading Sentences in the Wild"
50 / 340 papers shown
Title
MusicFace: Music-driven Expressive Singing Face Synthesis
Peng Liu
W. Deng
Hengda Li
Jintai Wang
Yinglin Zheng
Yiwei Ding
Xiaohu Guo
Ming Zeng
CVBM
35
10
0
24 Mar 2023
Learning Cross-lingual Visual Speech Representations
Andreas Zinonos
A. Haliassos
Pingchuan Ma
Stavros Petridis
M. Pantic
SSL
22
8
0
14 Mar 2023
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
21
3
0
09 Mar 2023
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
Xize Cheng
Lin Li
Tao Jin
Rongjie Huang
Wang Lin
Zehan Wang
Huangdai Liu
Yejin Wang
Aoxiong Yin
Zhou Zhao
26
24
0
09 Mar 2023
A Light Weight Model for Active Speaker Detection
Junhua Liao
Haihan Duan
Kanghui Feng
Wanbing Zhao
Yanbing Yang
Liangyin Chen
35
36
0
08 Mar 2023
Visuo-Tactile-Based Slip Detection Using A Multi-Scale Temporal Convolution Network
Junli Gao
Zhaoji Huang
Zhao-Li Tang
Haitao Song
Wenyu Liang
24
4
0
27 Feb 2023
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video
Minsu Kim
Chae Won Kim
Y. Ro
CVBM
DiffM
38
3
0
27 Feb 2023
Lip-to-Speech Synthesis in the Wild with Multi-task Learning
Minsu Kim
Joanna Hong
Y. Ro
14
21
0
17 Feb 2023
Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition
Minsu Kim
Hyungil Kim
Y. Ro
VLM
13
18
0
16 Feb 2023
LipFormer: Learning to Lipread Unseen Speakers based on Visual-Landmark Transformers
Feng Xue
Yu Li
Deyin Liu
Yincen Xie
Lin Wu
Richang Hong
36
12
0
04 Feb 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
29
10
0
21 Jan 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
26
15
0
19 Jan 2023
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset
J. Park
Jung-Wook Hwang
Kwanghee Choi
Seung-Hyun Lee
Jun-Hwan Ahn
R.-H. Park
Hyung-Min Park
29
3
0
16 Jan 2023
Speech Driven Video Editing via an Audio-Conditioned Diffusion Model
Dan Bigioi
Shubhajit Basak
Michał Stypułkowski
Maciej Zięba
H. Jordan
R. McDonnell
Peter Corcoran
DiffM
VGen
24
35
0
10 Jan 2023
Audio-Visual Efficient Conformer for Robust Speech Recognition
Maxime Burchi
Radu Timofte
VLM
11
33
0
04 Jan 2023
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
M. Pantic
SSL
45
48
0
12 Dec 2022
Learning to Dub Movies via Hierarchical Prosody Models
Gaoxiang Cong
Liang Li
Yuankai Qi
Zhengjun Zha
Qi Wu
Wen-yu Wang
Bin Jiang
Ming Yang
Qin Huang
75
25
0
08 Dec 2022
LISA: Localized Image Stylization with Audio via Implicit Neural Representation
Seung Hyun Lee
Chanyoung Kim
Wonmin Byeon
Sang Ho Yoon
Jinkyu Kim
Sangpil Kim
30
3
0
21 Nov 2022
AVATAR submission to the Ego4D AV Transcription Challenge
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
30
0
0
18 Nov 2022
An Investigation of Smart Contract for Collaborative Machine Learning Model Training
Sheng Ding
Chenhui Hu
6
2
0
12 Sep 2022
Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception
Jiadong Wang
Xinyuan Qian
Haizhou Li
41
14
0
05 Sep 2022
Training Strategies for Improved Lip-reading
Pingchuan Ma
Yujiang Wang
Stavros Petridis
Jie Shen
M. Pantic
28
46
0
03 Sep 2022
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
45
10
0
01 Sep 2022
Bayesian Neural Network Language Modeling for Speech Recognition
Boyang Xue
Shoukang Hu
Junhao Xu
Mengzhe Geng
Xunying Liu
Helen M. Meng
UQCV
BDL
44
14
0
28 Aug 2022
Speaker-adaptive Lip Reading with User-dependent Padding
Minsu Kim
Hyunjun Kim
Y. Ro
17
20
0
09 Aug 2022
Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition
Joanna Hong
Minsu Kim
Daehun Yoo
Y. Ro
26
20
0
13 Jul 2022
Dual-Path Cross-Modal Attention for better Audio-Visual Speech Extraction
Zhongweiyang Xu
Xulin Fan
M. Hasegawa-Johnson
19
2
0
09 Jul 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
68
0
0
28 Jun 2022
Self-Supervised Learning for Videos: A Survey
Madeline Chantry Schiappa
Yogesh S Rawat
M. Shah
SSL
36
131
0
18 Jun 2022
AVATAR: Unconstrained Audiovisual Speech Recognition
Valentin Gabeur
Paul Hongsuck Seo
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
23
11
0
15 Jun 2022
Face-Dubbing++: Lip-Synchronous, Voice Preserving Translation of Videos
Alexander Waibel
M. Behr
Fevziye Irem Eyiokur
Dogucan Yaman
Tuan-Nam Nguyen
Carlos Mullov
Mehmet Arif Demirtas
Alperen Kantarci
Stefan Constantin
H. K. Ekenel
CVBM
15
14
0
09 Jun 2022
Lip-Listening: Mixing Senses to Understand Lips using Cross Modality Knowledge Distillation for Word-Based Models
Hadeel Mabrouk
Omar Abugabal
Nourhan Sakr
Hesham M. Eraqi
VLM
33
2
0
05 Jun 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
21
7
0
04 Jun 2022
HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System
Bao-Sinh Nguyen
Q. Tran
Tuan-Anh Dang Nguyen
D. Nguyen
H. Le
35
0
0
01 Jun 2022
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts
Debjoy Saha
Shravan Nayak
Timo Baumann
24
3
0
24 May 2022
Deep Learning for Visual Speech Analysis: A Survey
Changchong Sheng
Gangyao Kuang
L. Bai
Chen Hou
Y. Guo
Xin Xu
M. Pietikäinen
Li Liu
VLM
29
33
0
22 May 2022
End-to-End Multi-Person Audio/Visual Automatic Speech Recognition
Otavio Braga
Takaki Makino
Olivier Siohan
H. Liao
CVBM
14
15
0
11 May 2022
A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
Otavio Braga
Olivier Siohan
19
7
0
11 May 2022
Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection
Otavio Braga
Olivier Siohan
CVBM
27
8
0
10 May 2022
Scaling up sign spotting through sign language dictionaries
Gül Varol
Liliane Momeni
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
29
14
0
09 May 2022
Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
Minsu Kim
Jeong Hun Yeo
Yong Man Ro
13
61
0
04 Apr 2022
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction
Zexu Pan
Meng Ge
Haizhou Li
21
17
0
31 Mar 2022
Localizing Visual Sounds the Easy Way
Shentong Mo
Pedro Morgado
26
78
0
17 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
M. Pantic
VLM
128
144
0
26 Feb 2022
Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video
Matthew Groh
Aruna Sankaranarayanan
Nikhil Singh
Dong Young Kim
A. Lippman
Rosalind W. Picard
11
17
0
25 Feb 2022
Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition
Xichen Pan
Peiyu Chen
Yichen Gong
Helong Zhou
Xinbing Wang
Zhouhan Lin
SSL
25
34
0
24 Feb 2022
Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
ViT
91
40
0
25 Jan 2022
Survey on the Convergence of Machine Learning and Blockchain
Sheng Ding
Chenhui Hu
SyDa
10
10
0
04 Jan 2022
DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering
Shunyu Yao
Ruizhe Zhong
Yichao Yan
Guangtao Zhai
Xiaokang Yang
CVBM
24
90
0
03 Jan 2022
Skin feature point tracking using deep feature encodings
J. Chang
Torbjörn E. M. Nordling
20
2
0
28 Dec 2021