Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks

1 September 2017
Jen-Cheng Hou
Syu-Siang Wang
Ying-Hui Lai
Yu Tsao
Hsiu-Wen Chang
H. Wang
arXiv: 1709.00944

Papers citing "Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks"

30 / 30 papers shown
Diffusion-based Unsupervised Audio-visual Speech Enhancement
Jean-Eudes Ayilo
Mostafa Sadeghi
Romain Serizel
Xavier Alameda-Pineda
DiffM
22
0
0
04 Oct 2024
Audio-visual video-to-speech synthesis with synthesized input audio
Triantafyllos Kefalas
Yannis Panagakis
M. Pantic
VGen
DiffM
38
1
0
31 Jul 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation
Ruixin Zheng
Yang Ai
Zhenhua Ling
32
8
0
24 May 2023
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
39
30
0
20 Jul 2022
Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention
Xinmeng Xu
Yang Wang
Jie Jia
Binbin Chen
Dejun Li
21
9
0
30 Jun 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
36
80
0
16 Jun 2022
EPG2S: Speech Generation and Speech Enhancement based on Electropalatography and Audio Signals using Multimodal Learning
Lichin Chen
Po-Hsun Chen
Richard Tzong-Han Tsai
Yu Tsao
14
8
0
16 Jun 2022
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos
V. S. Kadandale
G. Haro
ViT
23
19
0
08 Mar 2022
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
16
25
0
15 Feb 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
21
56
0
14 Feb 2022
A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement
Tassadaq Hussain
Wei-Chien Wang
M. Gogate
K. Dashtipour
Yu Tsao
Xugang Lu
A. Ahsan
Amir Hussain
21
3
0
24 Jan 2022
Towards Robust Real-time Audio-Visual Speech Enhancement
M. Gogate
K. Dashtipour
Amir Hussain
29
3
0
16 Dec 2021
FaVoA: Face-Voice Association Favours Ambiguous Speaker Detection
Hugo C. C. Carneiro
C. Weber
S. Wermter
CVBM
31
7
0
01 Sep 2021
Look Who's Talking: Active Speaker Detection in the Wild
You Jin Kim
Hee-Soo Heo
Soyeon Choe
Soo-Whan Chung
Yoohwan Kwon
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
44
20
0
17 Aug 2021
Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Shota Orihashi
Ryo Masumura
30
8
0
02 Mar 2021
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Efthymios Tzinis
Scott Wisdom
A. Jansen
Shawn Hershey
Tal Remez
D. Ellis
J. Hershey
39
69
0
02 Nov 2020
Listening to Sounds of Silence for Speech Denoising
Ruilin Xu
Rundi Wu
Y. Ishiwaka
Carl Vondrick
Changxi Zheng
25
32
0
22 Oct 2020
Correlating Subword Articulation with Lip Shapes for Embedding Aware Audio-Visual Speech Enhancement
Hang Chen
Jun Du
Yu Hu
Lirong Dai
Baocai Yin
Chin-Hui Lee
31
19
0
21 Sep 2020
Lite Audio-Visual Speech Enhancement
Shang-Yi Chuang
Yu Tsao
Chen-Chou Lo
Hsin-Min Wang
16
24
0
24 May 2020
Discriminative Multi-modality Speech Recognition
Bo Xu
Cheng Lu
Yandong Guo
Jacob Wang
20
98
0
12 May 2020
Time-Domain Multi-modal Bone/air Conducted Speech Enhancement
Cheng Yu
Kuo-Hsuan Hung
Syu-Siang Wang
Szu-Wei Fu
Yu Tsao
J. Hung
26
33
0
22 Nov 2019
MMTM: Multimodal Transfer Module for CNN Fusion
Hamid Reza Vaezi Joze
Amirreza Shaban
Michael L. Iuzzolino
K. Koishida
18
277
0
20 Nov 2019
CochleaNet: A Robust Language-independent Audio-Visual Model for Speech Enhancement
M. Gogate
K. Dashtipour
Ahsan Adeel
Amir Hussain
23
53
0
23 Sep 2019
Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders
M. Sadeghi
Simon Leglaive
Xavier Alameda-Pineda
Laurent Girin
Radu Horaud
DiffM
19
65
0
07 Aug 2019
Role of Awareness and Universal Context in a Spiking Conscious Neural Network (SCNN): A New Perspective and Future Directions
Ahsan Adeel
18
0
0
05 Nov 2018
Lip-Reading Driven Deep Learning Approach for Speech Enhancement
Ahsan Adeel
M. Gogate
Amir Hussain
W. Whitmer
19
62
0
31 Jul 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
Andrew Owens
Alexei A. Efros
SSL
51
745
0
10 Apr 2018
The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches
Md. Zahangir Alom
T. Taha
C. Yakopcic
Stefan Westberg
P. Sidike
Mst Shamima Nasrin
B. Van Essen
A. Awwal
V. Asari
VLM
29
873
0
03 Mar 2018
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
176
784
0
16 Nov 2016