ResearchTrend.AI
Lip Reading Sentences in the Wild


16 November 2016
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman

Papers citing "Lip Reading Sentences in the Wild"

50 / 340 papers shown
A Cascade Sequence-to-Sequence Model for Chinese Mandarin Lip Reading
Ya Zhao
Rui Xu
Xiuming Zhang
16
62
0
14 Aug 2019
Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval
Binghui Chen
Weihong Deng
VLM
FedML
26
55
0
27 Jul 2019
UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor
P. Christiansen
M. Kragh
Y. Brodskiy
H. Karstoft
3DPC
20
86
0
09 Jul 2019
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions
Tejas Srinivasan
Ramon Sanabria
Florian Metze
6
8
0
30 Jun 2019
Lipper: Synthesizing Thy Speech using Multi-View Lipreading
Yaman Kumar Singla
Rohit Jain
Khwaja Mohd. Salik
R. Shah
Yifang Yin
Roger Zimmermann
56
39
0
28 Jun 2019
LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
D. Margam
R. Aralikatti
Tanay Sharma
Abhinav Thanda
Pujitha A. K.
Sharad Roy
S. Venkatesan
18
17
0
25 Jun 2019
Video-Driven Speech Reconstruction using Generative Adversarial Networks
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
M. Pantic
GAN
22
49
0
14 Jun 2019
On Single Source Robustness in Deep Fusion Models
Taewan Kim
Joydeep Ghosh
AAML
16
22
0
11 Jun 2019
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition
Pingchuan Ma
Stavros Petridis
M. Pantic
AuLLM
33
10
0
05 Jun 2019
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain
Johanes Effendi
Andros Tjandra
S. Sakti
Satoshi Nakamura
21
3
0
03 Jun 2019
Video-to-Video Translation for Visual Speech Synthesis
M. Doukas
V. Sharmanska
S. Zafeiriou
26
0
0
28 May 2019
Speech2Face: Learning the Face Behind a Voice
Tae-Hyun Oh
Tali Dekel
Changil Kim
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Wojciech Matusik
SSL
CVBM
27
163
0
23 May 2019
MobiVSR: A Visual Speech Recognition Solution for Mobile Devices
Nilay Shrivastava
Astitwa Saxena
Yaman Kumar Singla
Preeti Kaur
Debanjan Mahata
R. Shah
27
3
0
10 May 2019
Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading
Xinshuo Weng
Kris Kitani
19
71
0
04 May 2019
Evaluating Recurrent Neural Network Explanations
L. Arras
Ahmed Osman
K. Müller
Wojciech Samek
XAI
FAtt
24
88
0
26 Apr 2019
An Analysis of Speech Enhancement and Recognition Losses in Limited Resources Multi-talker Single Channel Audio-Visual ASR
Luca Pasa
Giovanni Morrone
Leonardo Badino
6
2
0
16 Apr 2019
Synthesising 3D Facial Motion from "In-the-Wild" Speech
Panagiotis Tzirakis
A. Papaioannou
Alexandros Lattas
Michail Tarasiou
Björn Schuller
S. Zafeiriou
CVBM
18
13
0
15 Apr 2019
The Sound of Motions
Hang Zhao
Chuang Gan
Wei-Chiu Ma
Antonio Torralba
17
251
0
11 Apr 2019
Learning from Videos with Deep Convolutional LSTM Networks
Logan Courtney
R. Sreenivas
11
7
0
09 Apr 2019
Time Domain Audio Visual Speech Separation
Jian Wu
Yong-mei Xu
Shi-Xiong Zhang
Lianwu Chen
Meng Yu
Lei Xie
Dong Yu
25
114
0
07 Apr 2019
End-to-End Visual Speech Recognition for Small-Scale Datasets
Stavros Petridis
Yujiang Wang
Pingchuan Ma
Zuwei Li
M. Pantic
AI4TS
VLM
14
35
0
02 Apr 2019
Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks
A. Duarte
Francisco Roldan
Miquel Tubau
Janna Escur
Santiago Pascual
Amaia Salvador
Eva Mohedano
Kevin McGuinness
Jordi Torres
Xavier Giró-i-Nieto
GAN
CVBM
33
79
0
25 Mar 2019
Dual-modality seq2seq network for audio-visual event localization
Yan-Bo Lin
Yu-Jhe Li
Y. Wang
19
127
0
20 Feb 2019
FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation
Chaitanya Kaul
S. Manandhar
Nick E. Pears
SSeg
27
129
0
08 Feb 2019
Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition
Yaman Kumar Singla
Dhruva Sahrawat
Shubham Maheshwari
Debanjan Mahata
Amanda Stent
Yifang Yin
R. Shah
Roger Zimmermann
VLM
13
12
0
29 Jan 2019
FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals
U. Ciftci
Ilke Demir
45
373
0
08 Jan 2019
An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition
Devesh Walawalkar
Yihui He
R. Pillai
28
1
0
21 Dec 2018
DeepFakes: a New Threat to Face Recognition? Assessment and Detection
Pavel Korshunov
Sébastien Marcel
PICV
CVBM
36
592
0
20 Dec 2018
On Attention Modules for Audio-Visual Synchronization
Naji Khosravan
Shervin Ardeshir
R. Puri
11
21
0
14 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
36
20
0
07 Dec 2018
Privacy Partitioning: Protecting User Data During the Deep Learning Inference Phase
Jianfeng Chi
Emmanuel Owusu
Xuwang Yin
Tong Yu
William Chan
P. Tague
Yuan Tian
FedML
19
28
0
07 Dec 2018
Modality Attention for End-to-End Audio-visual Speech Recognition
Pan Zhou
Wenwen Yang
Wei Chen
Yanfeng Wang
Jia Jia
24
69
0
13 Nov 2018
Multimodal Grounding for Sequence-to-Sequence Speech Recognition
Ozan Caglayan
Ramon Sanabria
Shruti Palaskar
Loïc Barrault
Florian Metze
26
25
0
09 Nov 2018
The speaker-independent lipreading play-off; a survey of lipreading machines
Jake Burton
David Frank
Mahdi Saleh
Nassir Navab
Helen L. Bear
6
11
0
24 Oct 2018
LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild
Shuang Yang
Yuanhang Zhang
Dalu Feng
Mingmin Yang
Chenhao Wang
Jingyun Xiao
Keyu Long
Shiguang Shan
Xilin Chen
14
150
0
16 Oct 2018
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Stavros Petridis
Themos Stafylakis
Pingchuan Ma
Georgios Tzimiropoulos
M. Pantic
14
128
0
28 Sep 2018
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation
Soo-Whan Chung
Joon Son Chung
Hong-Goo Kang
6
117
0
21 Sep 2018
End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models
Fei Tao
Carlos Busso
24
34
0
12 Sep 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
27
687
0
06 Sep 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
39
65
0
05 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
14
425
0
03 Sep 2018
Single-Microphone Speech Enhancement and Separation Using Deep Learning
Morten Kolbaek
20
7
0
31 Aug 2018
Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments
Ahsan Adeel
M. Gogate
Amir Hussain
9
52
0
28 Aug 2018
Dynamic Temporal Alignment of Speech to Lips
Tavi Halperin
Ariel Ephrat
Shmuel Peleg
11
40
0
19 Aug 2018
Lip-Reading Driven Deep Learning Approach for Speech Enhancement
Ahsan Adeel
M. Gogate
Amir Hussain
W. Whitmer
21
62
0
31 Jul 2018
X2Face: A network for controlling face generation by using images, audio, and pose codes
Olivia Wiles
A. Sophia Koepke
Andrew Zisserman
CVBM
30
410
0
27 Jul 2018
Zero-shot keyword spotting for visual speech recognition in-the-wild
Themos Stafylakis
Georgios Tzimiropoulos
32
38
0
23 Jul 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Hang Zhou
Yu Liu
Ziwei Liu
Ping Luo
Xiaogang Wang
CVBM
31
436
0
20 Jul 2018
Large-Scale Visual Speech Recognition
Brendan Shillingford
Yannis Assael
Matthew W. Hoffman
T. Paine
Cían Hughes
...
Marie Mulville
Ben Coppin
Ben Laurie
A. Senior
Nando de Freitas
29
152
0
13 Jul 2018
Deep Lip Reading: a comparison of models and an online application
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
29
118
0
15 Jun 2018