1809.00496
LRS3-TED: a large-scale dataset for visual speech recognition
3 September 2018
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
Papers citing
"LRS3-TED: a large-scale dataset for visual speech recognition"
50 / 110 papers shown
Title
WASD: A Wilder Active Speaker Detection Dataset
Tiago Roxo
Joana Cabral Costa
Pedro R. M. Inácio
Hugo Manuel Proença
24
3
0
09 Mar 2023
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Meng Liu
Kong Aik Lee
Longbiao Wang
Hanyi Zhang
Chang Zeng
J. Dang
23
10
0
22 Feb 2023
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Zhenhui Ye
Ziyue Jiang
Yi Ren
Jinglin Liu
Jinzheng He
Zhou Zhao
CVBM
39
123
0
31 Jan 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
32
10
0
21 Jan 2023
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset
J. Park
Jung-Wook Hwang
Kwanghee Choi
Seung-Hyun Lee
Jun-Hwan Ahn
R.-H. Park
Hyung-Min Park
29
3
0
16 Jan 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
Wei-Ning Hsu
Tal Remez
Bowen Shi
Jacob Donley
Yossi Adi
DiffM
27
12
0
21 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Maja Pantic
SSL
45
49
0
12 Dec 2022
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Chen Chen
Yuchen Hu
Qiang Zhang
Heqing Zou
Beier Zhu
Eng Siong Chng
33
26
0
10 Dec 2022
Streaming Audio-Visual Speech Recognition with Alignment Regularization
Pingchuan Ma
Niko Moritz
Stavros Petridis
Christian Fuegen
Maja Pantic
37
2
0
03 Nov 2022
SS-VAERR: Self-Supervised Apparent Emotional Reaction Recognition from Video
Marija Jegorova
Stavros Petridis
Maja Pantic
29
2
0
20 Oct 2022
Relaxed Attention for Transformer Models
Timo Lohrenz
Björn Möller
Zhengyang Li
Tim Fingscheidt
KELM
29
11
0
20 Sep 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
Visual Speech-Aware Perceptual 3D Facial Expression Reconstruction from Videos
P. Filntisis
George Retsinas
Foivos Paraperas-Papantoniou
Athanasios Katsamanis
A. Roussos
Petros Maragos
3DH
26
29
0
22 Jul 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality
Wei-Ning Hsu
Bowen Shi
SSL
VLM
29
42
0
14 Jul 2022
Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition
Joanna Hong
Minsu Kim
Daehun Yoo
Y. Ro
26
21
0
13 Jul 2022
Cut Inner Layers: A Structured Pruning Strategy for Efficient U-Net GANs
Bo-Kyeong Kim
Shinkook Choi
Hancheol Park
21
4
0
29 Jun 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
29
7
0
04 Jun 2022
Is Lip Region-of-Interest Sufficient for Lipreading?
Jing-Xuan Zhang
Genshun Wan
Jia Pan
24
6
0
28 May 2022
A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection
Otavio Braga
Olivier Siohan
27
7
0
11 May 2022
Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection
Otavio Braga
Olivier Siohan
CVBM
35
8
0
10 May 2022
Scaling up sign spotting through sign language dictionaries
Gül Varol
Liliane Momeni
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
29
14
0
09 May 2022
Attention-Based Lip Audio-Visual Synthesis for Talking Face Generation in the Wild
Gang Wang
Peng Zhang
Lei Xie
Wei Huang
Yufei Zha
CVBM
27
14
0
08 Mar 2022
Visual Speech Recognition for Multiple Languages in the Wild
Pingchuan Ma
Stavros Petridis
Maja Pantic
VLM
130
145
0
26 Feb 2022
Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition
Zitian Zhang
Jie Zhang
Jian-Shu Zhang
Ming Wu
Xin Fang
Lirong Dai
SSL
41
10
0
15 Feb 2022
Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
ViT
96
40
0
25 Jan 2022
CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition
Wenliang Dai
Samuel Cahyawijaya
Tiezheng Yu
Elham J. Barezi
Peng Xu
...
Genta Indra Winata
Qifeng Chen
Xiaojuan Ma
Bertram E. Shi
Pascale Fung
41
11
0
11 Jan 2022
Robust Self-Supervised Audio-Visual Speech Recognition
Bowen Shi
Wei-Ning Hsu
Abdel-rahman Mohamed
39
90
0
05 Jan 2022
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
Bowen Shi
Wei-Ning Hsu
Kushal Lakhotia
Abdel-rahman Mohamed
SSL
55
306
0
05 Jan 2022
Responsive Listening Head Generation: A Benchmark Dataset and Baseline
Mohan Zhou
Yalong Bai
Wei Zhang
Ting Yao
Tiejun Zhao
Tao Mei
EGVM
30
45
0
27 Dec 2021
Deep Spoken Keyword Spotting: An Overview
Iván López-Espejo
Zheng-Hua Tan
John H. L. Hansen
Jesper Jensen
21
102
0
20 Nov 2021
Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
Haozhe Wu
Jia Jia
Haoyu Wang
Yishun Dou
Chao Duan
Qingshan Deng
CVBM
11
73
0
30 Oct 2021
Visual Keyword Spotting with Attention
Prajwal K R
Liliane Momeni
Triantafyllos Afouras
Andrew Zisserman
19
13
0
29 Oct 2021
Sub-word Level Lip Reading With Visual Attention
Prajwal K R
Triantafyllos Afouras
Andrew Zisserman
17
92
0
14 Oct 2021
The VVAD-LRS3 Dataset for Visual Voice Activity Detection
Adrian Lubitz
Matias Valdenegro-Toro
Frank Kirchner
26
3
0
28 Sep 2021
The Right to Talk: An Audio-Visual Transformer Approach
Thanh-Dat Truong
C. Duong
T. D. Vu
H. Pham
Bhiksha Raj
Ngan Le
Khoa Luu
63
36
0
06 Aug 2021
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection
Ruijie Tao
Zexu Pan
Rohan Kumar Das
Xinyuan Qian
Mike Zheng Shou
Haizhou Li
22
176
0
14 Jul 2021
LiRA: Learning Visual Speech Representations from Audio through Self-supervision
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Björn W. Schuller
Maja Pantic
SSL
24
53
0
16 Jun 2021
Fusing information streams in end-to-end audio-visual speech recognition
Wentao Yu
Steffen Zeiler
D. Kolossa
81
12
0
19 Apr 2021
Exploring Deep Learning for Joint Audio-Visual Lip Biometrics
Meng Liu
Longbiao Wang
Kong Aik Lee
Hanyi Zhang
Chang Zeng
J. Dang
HAI
30
12
0
17 Apr 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
Ryo Masumura
30
8
0
02 Mar 2021
Visual Speech Enhancement Without A Real Visual Stream
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
DiffM
20
17
0
20 Dec 2020
TaL: a synchronised multi-speaker corpus of ultrasound tongue imaging, audio, and lip videos
M. Ribeiro
Jennifer Sanger
Jingxuan Zhang
Aciel Eshky
A. Wrench
Korin Richmond
Steve Renals
LM&MA
24
33
0
19 Nov 2020
Video Generative Adversarial Networks: A Review
Nuha Aldausari
Arcot Sowmya
Nadine Marcus
Gelareh Mohammadi
EGVM
21
103
0
04 Nov 2020
Watch, read and lookup: learning to spot signs from multiple supervisors
Liliane Momeni
Gül Varol
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
26
43
0
08 Oct 2020
Seeing wake words: Audio-visual Keyword Spotting
Liliane Momeni
Triantafyllos Afouras
Themos Stafylakis
Samuel Albanie
Andrew Zisserman
46
43
0
02 Sep 2020
A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
EGVM
52
759
0
23 Aug 2020
Self-Supervised Learning of Audio-Visual Objects from Video
Triantafyllos Afouras
Andrew Owens
Joon Son Chung
Andrew Zisserman
SSL
19
253
0
10 Aug 2020
Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition
L. Wei
Jie Zhang
Junfeng Hou
Lirong Dai
16
14
0
06 Aug 2020
"Notic My Speech" -- Blending Speech Patterns With Multimedia
Dhruva Sahrawat
Yaman Kumar Singla
Shashwat Aggarwal
Yifang Yin
R. Shah
Roger Zimmermann
33
3
0
12 Jun 2020
Multimodal Target Speech Separation with Voice and Face References
Leyuan Qu
C. Weber
S. Wermter
CVBM
19
19
0
17 May 2020