Audio-visual Multi-channel Recognition of Overlapped Speech


18 May 2020
Jianwei Yu
Bo Wu
Rongzhi Gu
Shi-Xiong Zhang
Lianwu Chen
Yong Xu
Meng Yu
Dan Su
Dong Yu
Xunying Liu
Helen Meng

Papers citing "Audio-visual Multi-channel Recognition of Overlapped Speech"

18 / 18 papers shown
Neural Spatio-Temporal Beamformer for Target Speech Separation
Yong Xu
Meng Yu
Shi-Xiong Zhang
Lianwu Chen
Chao Weng
Jianming Liu
Dong Yu
38
41
0
08 May 2020
End-to-End Multi-speaker Speech Recognition with Transformer
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
ViT
58
104
0
10 Feb 2020
Audio-visual Recognition of Overlapped speech for the LRS2 dataset
Jianwei Yu
Shi-Xiong Zhang
Jian Wu
Shahram Ghorbani
Bo Wu
Shiyin Kang
Shansong Liu
Xunying Liu
Helen Meng
Dong Yu
66
73
0
06 Jan 2020
End-to-end training of time domain audio separation and recognition
Thilo von Neumann
K. Kinoshita
Lukas Drude
Christoph Boeddeker
Marc Delcroix
Tomohiro Nakatani
Reinhold Haeb-Umbach
52
34
0
18 Dec 2019
Recurrent Neural Network Transducer for Audio-Visual Speech Recognition
Takaki Makino
H. Liao
Yannis Assael
Brendan Shillingford
Basi García
Otavio Braga
Olivier Siohan
56
129
0
08 Nov 2019
End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation
Yi Luo
Zhuo Chen
N. Mesgarani
Takuya Yoshioka
49
180
0
30 Oct 2019
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition
Xuankai Chang
Wangyou Zhang
Y. Qian
Jonathan Le Roux
Shinji Watanabe
52
117
0
15 Oct 2019
FaSNet: Low-latency Adaptive Beamforming for Multi-microphone Audio Processing
Yi Luo
Enea Ceolini
Cong Han
Shih-Chii Liu
N. Mesgarani
49
150
0
29 Sep 2019
My lips are concealed: Audio-visual speech enhancement through obstructions
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
58
91
0
11 Jul 2019
Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments
Guan-Lin Chao
William Chan
Ian Lane
31
13
0
13 Jun 2019
A comprehensive study of speech separation: spectrogram vs waveform separation
F. Bahmaninezhad
Jian Wu
Rongzhi Gu
Shi-Xiong Zhang
Yong Xu
Meng Yu
Dong Yu
60
81
0
17 May 2019
Time Domain Audio Visual Speech Separation
Jian Wu
Yong Xu
Shi-Xiong Zhang
Lianwu Chen
Meng Yu
Lei Xie
Dong Yu
60
117
0
07 Apr 2019
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks
Takuya Yoshioka
Hakan Erdogan
Zhuo Chen
Xiong Xiao
F. Alleva
BDL
51
81
0
08 Oct 2018
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Yi Luo
N. Mesgarani
144
1,783
0
20 Sep 2018
Deep Audio-Visual Speech Recognition
Triantafyllos Afouras
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
67
701
0
06 Sep 2018
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Hang Zhou
Yu Liu
Ziwei Liu
Ping Luo
Xiaogang Wang
CVBM
84
441
0
20 Jul 2018
Deep Lip Reading: a comparison of models and an online application
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
56
118
0
15 Jun 2018
The Conversation: Deep Audio-Visual Speech Enhancement
Triantafyllos Afouras
Joon Son Chung
Andrew Zisserman
72
360
0
11 Apr 2018