2002.05639
Looking Enhances Listening: Recovering Missing Speech Using Images
13 February 2020
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Papers citing "Looking Enhances Listening: Recovering Missing Speech Using Images"
10 / 10 papers shown
Title
VHASR: A Multimodal Speech Recognition System With Vision Hotwords
Jiliang Hu
Zuchao Li
Ping Wang
Haojun Ai
Lefei Zhang
Hai Zhao
62
1
0
01 Oct 2024
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
58
15
0
29 Mar 2023
Multimodal Speech Recognition for Language-Guided Embodied Agents
Allen Chang
Xiaoyuan Zhu
Aarav Monga
Seoho Ahn
Tejas Srinivasan
Jesse Thomason
AuLLM
105
3
0
27 Feb 2023
AVATAR: Unconstrained Audiovisual Speech Recognition
Valentin Gabeur
Paul Hongsuck Seo
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
72
11
0
15 Jun 2022
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations
Dan Oneaţă
H. Cucu
51
19
0
27 Apr 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey
Ngoc Dung Huynh
Mohamed Reda Bouadjenek
Imran Razzak
Kevin Lee
Chetan Arora
Ali Hassani
A. Zaslavsky
AAML
65
6
0
22 Feb 2022
Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations
Shahram Ghorbani
Yashesh Gaur
Yu Shi
Jinyu Li
75
14
0
08 Nov 2020
Multimodal Speech Recognition with Unstructured Audio Masking
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
CVBM
48
10
0
16 Oct 2020
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
76
11
0
05 Oct 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
126
361
0
21 Apr 2020