Lip Reading Sentences in the Wild

16 November 2016
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman

Papers citing "Lip Reading Sentences in the Wild"

50 / 344 papers shown
Title
Deep Learning for Deepfakes Creation and Detection: A Survey
Thanh Thi Nguyen
Quoc Viet Hung Nguyen
Dung Nguyen
D. Nguyen
Thien Huynh-The
S. Nahavandi
Thanh Tam Nguyen
Quoc-Viet Pham
Cu Nguyen
109
457
0
25 Sep 2019
Gated Channel Transformation for Visual Recognition
Zongxin Yang
Linchao Zhu
Yu Wu
Yezhou Yang
ViT
64
212
0
25 Sep 2019
What can computational models learn from human selective attention? A review from an audiovisual crossmodal perspective
Di Fu
C. Weber
Guochun Yang
Matthias Kerzel
Weizhi Nan
Pablo V. A. Barros
Haiyan Wu
Xun Liu
S. Wermter
33
0
0
05 Sep 2019
Multi-Grained Spatio-temporal Modeling for Lip-reading
Chenhao Wang
85
52
0
30 Aug 2019
A Cascade Sequence-to-Sequence Model for Chinese Mandarin Lip Reading
Ya Zhao
Rui Xu
Xiuming Zhang
86
64
0
14 Aug 2019
Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval
Binghui Chen
Weihong Deng
VLM
FedML
57
56
0
27 Jul 2019
UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor
P. Christiansen
M. Kragh
Y. Brodskiy
H. Karstoft
3DPC
86
87
0
09 Jul 2019
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions
Tejas Srinivasan
Ramon Sanabria
Florian Metze
47
8
0
30 Jun 2019
Lipper: Synthesizing Thy Speech using Multi-View Lipreading
Yaman Kumar Singla
Rohit Jain
Khwaja Mohd. Salik
R. Shah
Yifang Yin
Roger Zimmermann
109
41
0
28 Jun 2019
LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models
D. Margam
R. Aralikatti
Tanay Sharma
Abhinav Thanda
K. PujithaA.
Sharad Roy
S. Venkatesan
30
17
0
25 Jun 2019
Video-Driven Speech Reconstruction using Generative Adversarial Networks
Konstantinos Vougioukas
Pingchuan Ma
Stavros Petridis
Maja Pantic
GAN
76
49
0
14 Jun 2019
On Single Source Robustness in Deep Fusion Models
Taewan Kim
Joydeep Ghosh
AAML
52
22
0
11 Jun 2019
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition
Pingchuan Ma
Stavros Petridis
Maja Pantic
AuLLM
91
10
0
05 Jun 2019
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain
Johanes Effendi
Andros Tjandra
S. Sakti
Satoshi Nakamura
57
3
0
03 Jun 2019
Video-to-Video Translation for Visual Speech Synthesis
M. Doukas
V. Sharmanska
Stefanos Zafeiriou
37
0
0
28 May 2019
Speech2Face: Learning the Face Behind a Voice
Tae-Hyun Oh
Tali Dekel
Changil Kim
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Wojciech Matusik
SSL
CVBM
112
165
0
23 May 2019
MobiVSR: A Visual Speech Recognition Solution for Mobile Devices
Nilay Shrivastava
Astitwa Saxena
Yaman Kumar Singla
Preeti Kaur
Debanjan Mahata
R. Shah
83
3
0
10 May 2019
Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading
Xinshuo Weng
Kris Kitani
94
72
0
04 May 2019
Evaluating Recurrent Neural Network Explanations
L. Arras
Ahmed Osman
K. Müller
Wojciech Samek
XAI
FAtt
117
88
0
26 Apr 2019
An Analysis of Speech Enhancement and Recognition Losses in Limited Resources Multi-talker Single Channel Audio-Visual ASR
Luca Pasa
Giovanni Morrone
Leonardo Badino
48
2
0
16 Apr 2019
Synthesising 3D Facial Motion from "In-the-Wild" Speech
Panagiotis Tzirakis
A. Papaioannou
Alexandros Lattas
Michail Tarasiou
Björn Schuller
Stefanos Zafeiriou
CVBM
57
13
0
15 Apr 2019
The Sound of Motions
Hang Zhao
Chuang Gan
Wei-Chiu Ma
Antonio Torralba
88
254
0
11 Apr 2019
Learning from Videos with Deep Convolutional LSTM Networks
Logan Courtney
R. Sreenivas
28
7
0
09 Apr 2019
Time Domain Audio Visual Speech Separation
Jian Wu
Yong-mei Xu
Shi-Xiong Zhang
Lianwu Chen
Meng Yu
Lei Xie
Dong Yu
124
118
0
07 Apr 2019
End-to-End Visual Speech Recognition for Small-Scale Datasets
Stavros Petridis
Yujiang Wang
Pingchuan Ma
Zuwei Li
Maja Pantic
AI4TS
VLM
58
36
0
02 Apr 2019
Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks
A. Duarte
Francisco Roldan
Miquel Tubau
Janna Escur
Santiago Pascual
Amaia Salvador
Eva Mohedano
Kevin McGuinness
Jordi Torres
Xavier Giró-i-Nieto
GAN
CVBM
73
79
0
25 Mar 2019
Dual-modality seq2seq network for audio-visual event localization
Yan-Bo Lin
Yu-Jhe Li
Y. Wang
71
132
0
20 Feb 2019
FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation
Chaitanya Kaul
S. Manandhar
Nick E. Pears
SSeg
67
130
0
08 Feb 2019
Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition
Yaman Kumar Singla
Dhruva Sahrawat
Shubham Maheshwari
Debanjan Mahata
Amanda Stent
Yifang Yin
R. Shah
Roger Zimmermann
VLM
38
13
0
29 Jan 2019
FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals
U. Ciftci
Ilke Demir
89
380
0
08 Jan 2019
An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition
Devesh Walawalkar
Yihui He
R. Pillai
50
1
0
21 Dec 2018
DeepFakes: a New Threat to Face Recognition? Assessment and Detection
Pavel Korshunov
Sébastien Marcel
PICV
CVBM
93
610
0
20 Dec 2018
On Attention Modules for Audio-Visual Synchronization
Naji Khosravan
Shervin Ardeshir
R. Puri
65
21
0
14 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning
Yapeng Tian
Chenxiao Guan
Justin Goodman
Marc Moore
Chenliang Xu
91
20
0
07 Dec 2018
Privacy Partitioning: Protecting User Data During the Deep Learning Inference Phase
Jianfeng Chi
Emmanuel Owusu
Xuwang Yin
Tong Yu
William Chan
P. Tague
Yuan Tian
FedML
66
28
0
07 Dec 2018
Modality Attention for End-to-End Audio-visual Speech Recognition
Pan Zhou
Wenwen Yang
Wei Chen
Yanfeng Wang
Jia Jia
92
69
0
13 Nov 2018
Multimodal Grounding for Sequence-to-Sequence Speech Recognition
Ozan Caglayan
Ramon Sanabria
Shruti Palaskar
Loïc Barrault
Florian Metze
80
25
0
09 Nov 2018
The speaker-independent lipreading play-off; a survey of lipreading machines
Jake Burton
David Frank
Mahdi Saleh
Nassir Navab
Helen L. Bear
36
11
0
24 Oct 2018
LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild
Shuang Yang
Yuanhang Zhang
Dalu Feng
Mingmin Yang
Chenhao Wang
Jingyun Xiao
Keyu Long
Shiguang Shan
Xilin Chen
104
151
0
16 Oct 2018
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture
Stavros Petridis
Themos Stafylakis
Pingchuan Ma
Georgios Tzimiropoulos
Maja Pantic
77
131
0
28 Sep 2018
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation
Soo-Whan Chung
Joon Son Chung
Hong-Goo Kang
69
117
0
21 Sep 2018
End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models
Fei Tao
Carlos Busso
81
34
0
12 Sep 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
109
711
0
06 Sep 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition
George Sterpu
Christian Saam
N. Harte
103
65
0
05 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
73
446
0
03 Sep 2018
Single-Microphone Speech Enhancement and Separation Using Deep Learning
Morten Kolbaek
58
7
0
31 Aug 2018
Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments
Ahsan Adeel
M. Gogate
Amir Hussain
81
52
0
28 Aug 2018
Dynamic Temporal Alignment of Speech to Lips
Tavi Halperin
Ariel Ephrat
Shmuel Peleg
59
40
0
19 Aug 2018
Lip-Reading Driven Deep Learning Approach for Speech Enhancement
Ahsan Adeel
M. Gogate
Amir Hussain
W. Whitmer
81
65
0
31 Jul 2018
X2Face: A network for controlling face generation by using images, audio, and pose codes
Olivia Wiles
A. Sophia Koepke
Andrew Zisserman
CVBM
98
416
0
27 Jul 2018