
Lip Reading Sentences in the Wild

16 November 2016

Joon Son Chung

Papers citing "Lip Reading Sentences in the Wild"


Title
Deep Learning for Deepfakes Creation and Detection: A Survey Thanh Thi Nguyen Quoc Viet Hung Nguyen Dung Nguyen D. Nguyen Thien Huynh-The S. Nahavandi Thanh Tam Nguyen Quoc-Viet Pham Cu Nguyen 109 457 0 25 Sep 2019
Gated Channel Transformation for Visual Recognition Zongxin Yang Linchao Zhu Yu Wu Yezhou Yang ViT 64 212 0 25 Sep 2019
What can computational models learn from human selective attention? A review from an audiovisual crossmodal perspective Di Fu C. Weber Guochun Yang Matthias Kerzel Weizhi Nan Pablo V. A. Barros Haiyan Wu Xun Liu S. Wermter 33 0 0 05 Sep 2019
Multi-Grained Spatio-temporal Modeling for Lip-reading Chenhao Wang 85 52 0 30 Aug 2019
A Cascade Sequence-to-Sequence Model for Chinese Mandarin Lip Reading Ya Zhao Rui Xu Xiuming Zhang 86 64 0 14 Aug 2019
Hybrid-Attention based Decoupled Metric Learning for Zero-Shot Image Retrieval Binghui Chen Weihong Deng VLM FedML 57 56 0 27 Jul 2019
UnsuperPoint: End-to-end Unsupervised Interest Point Detector and Descriptor P. Christiansen M. Kragh Y. Brodskiy H. Karstoft 3DPC 86 87 0 09 Jul 2019
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions Tejas Srinivasan Ramon Sanabria Florian Metze 47 8 0 30 Jun 2019
Lipper: Synthesizing Thy Speech using Multi-View Lipreading Yaman Kumar Singla Rohit Jain Khwaja Mohd. Salik R. Shah Yifang Yin Roger Zimmermann 109 41 0 28 Jun 2019
LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models D. Margam R. Aralikatti Tanay Sharma Abhinav Thanda Pujitha A. K. Sharad Roy S. Venkatesan 30 17 0 25 Jun 2019
Video-Driven Speech Reconstruction using Generative Adversarial Networks Konstantinos Vougioukas Pingchuan Ma Stavros Petridis Maja Pantic GAN 76 49 0 14 Jun 2019
On Single Source Robustness in Deep Fusion Models Taewan Kim Joydeep Ghosh AAML 52 22 0 11 Jun 2019
Investigating the Lombard Effect Influence on End-to-End Audio-Visual Speech Recognition Pingchuan Ma Stavros Petridis Maja Pantic AuLLM 91 10 0 05 Jun 2019
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain Johanes Effendi Andros Tjandra S. Sakti Satoshi Nakamura 57 3 0 03 Jun 2019
Video-to-Video Translation for Visual Speech Synthesis M. Doukas V. Sharmanska Stefanos Zafeiriou 37 0 0 28 May 2019
Speech2Face: Learning the Face Behind a Voice Tae-Hyun Oh Tali Dekel Changil Kim Inbar Mosseri William T. Freeman Michael Rubinstein Wojciech Matusik SSL CVBM 112 165 0 23 May 2019
MobiVSR: A Visual Speech Recognition Solution for Mobile Devices Nilay Shrivastava Astitwa Saxena Yaman Kumar Singla Preeti Kaur Debanjan Mahata R. Shah 83 3 0 10 May 2019
Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading Xinshuo Weng Kris Kitani 94 72 0 04 May 2019
Evaluating Recurrent Neural Network Explanations L. Arras Ahmed Osman K. Müller Wojciech Samek XAI FAtt 117 88 0 26 Apr 2019
An Analysis of Speech Enhancement and Recognition Losses in Limited Resources Multi-talker Single Channel Audio-Visual ASR Luca Pasa Giovanni Morrone Leonardo Badino 48 2 0 16 Apr 2019
Synthesising 3D Facial Motion from "In-the-Wild" Speech Panagiotis Tzirakis A. Papaioannou Alexandros Lattas Michail Tarasiou Björn Schuller Stefanos Zafeiriou CVBM 57 13 0 15 Apr 2019
The Sound of Motions Hang Zhao Chuang Gan Wei-Chiu Ma Antonio Torralba 88 254 0 11 Apr 2019
Learning from Videos with Deep Convolutional LSTM Networks Logan Courtney R. Sreenivas 28 7 0 09 Apr 2019
Time Domain Audio Visual Speech Separation Jian Wu Yong-mei Xu Shi-Xiong Zhang Lianwu Chen Meng Yu Lei Xie Dong Yu 124 118 0 07 Apr 2019
End-to-End Visual Speech Recognition for Small-Scale Datasets Stavros Petridis Yujiang Wang Pingchuan Ma Zuwei Li Maja Pantic AI4TS VLM 58 36 0 02 Apr 2019
Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks A. Duarte Francisco Roldan Miquel Tubau Janna Escur Santiago Pascual Amaia Salvador Eva Mohedano Kevin McGuinness Jordi Torres Xavier Giró-i-Nieto GAN CVBM 73 79 0 25 Mar 2019
Dual-modality seq2seq network for audio-visual event localization Yan-Bo Lin Yu-Jhe Li Y. Wang 71 132 0 20 Feb 2019
FocusNet: An attention-based Fully Convolutional Network for Medical Image Segmentation Chaitanya Kaul S. Manandhar Nick E. Pears SSeg 67 130 0 08 Feb 2019
Harnessing GANs for Zero-shot Learning of New Classes in Visual Speech Recognition Yaman Kumar Singla Dhruva Sahrawat Shubham Maheshwari Debanjan Mahata Amanda Stent Yifang Yin R. Shah Roger Zimmermann VLM 38 13 0 29 Jan 2019
FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals U. Ciftci Ilke Demir 89 380 0 08 Jan 2019
An Empirical Analysis of Deep Audio-Visual Models for Speech Recognition Devesh Walawalkar Yihui He R. Pillai 50 1 0 21 Dec 2018
DeepFakes: a New Threat to Face Recognition? Assessment and Detection Pavel Korshunov Sébastien Marcel PICV CVBM 93 610 0 20 Dec 2018
On Attention Modules for Audio-Visual Synchronization Naji Khosravan Shervin Ardeshir R. Puri 65 21 0 14 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning Yapeng Tian Chenxiao Guan Justin Goodman Marc Moore Chenliang Xu 91 20 0 07 Dec 2018
Privacy Partitioning: Protecting User Data During the Deep Learning Inference Phase Jianfeng Chi Emmanuel Owusu Xuwang Yin Tong Yu William Chan P. Tague Yuan Tian FedML 66 28 0 07 Dec 2018
Modality Attention for End-to-End Audio-visual Speech Recognition Pan Zhou Wenwen Yang Wei Chen Yanfeng Wang Jia Jia 92 69 0 13 Nov 2018
Multimodal Grounding for Sequence-to-Sequence Speech Recognition Ozan Caglayan Ramon Sanabria Shruti Palaskar Loïc Barrault Florian Metze 80 25 0 09 Nov 2018
The speaker-independent lipreading play-off; a survey of lipreading machines Jake Burton David Frank Mahdi Saleh Nassir Navab Helen L. Bear 36 11 0 24 Oct 2018
LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild Shuang Yang Yuanhang Zhang Dalu Feng Mingmin Yang Chenhao Wang Jingyun Xiao Keyu Long Shiguang Shan Xilin Chen 104 151 0 16 Oct 2018
Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture Stavros Petridis Themos Stafylakis Pingchuan Ma Georgios Tzimiropoulos Maja Pantic 77 131 0 28 Sep 2018
Perfect match: Improved cross-modal embeddings for audio-visual synchronisation Soo-Whan Chung Joon Son Chung Hong-Goo Kang 69 117 0 21 Sep 2018
End-to-end Audiovisual Speech Activity Detection with Bimodal Recurrent Neural Models Fei Tao Carlos Busso 81 34 0 12 Sep 2018
Deep Audio-Visual Speech Recognition Triantafyllos Afouras Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 109 711 0 06 Sep 2018
Attention-based Audio-Visual Fusion for Robust Automatic Speech Recognition George Sterpu Christian Saam N. Harte 103 65 0 05 Sep 2018
LRS3-TED: a large-scale dataset for visual speech recognition Triantafyllos Afouras Joon Son Chung Andrew Zisserman 73 446 0 03 Sep 2018
Single-Microphone Speech Enhancement and Separation Using Deep Learning Morten Kolbaek 58 7 0 31 Aug 2018
Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments Ahsan Adeel M. Gogate Amir Hussain 81 52 0 28 Aug 2018
Dynamic Temporal Alignment of Speech to Lips Tavi Halperin Ariel Ephrat Shmuel Peleg 59 40 0 19 Aug 2018
Lip-Reading Driven Deep Learning Approach for Speech Enhancement Ahsan Adeel M. Gogate Amir Hussain W. Whitmer 81 65 0 31 Jul 2018
X2Face: A network for controlling face generation by using images, audio, and pose codes Olivia Wiles A. Sophia Koepke Andrew Zisserman CVBM 98 416 0 27 Jul 2018