Multimodal Grounding for Sequence-to-Sequence Speech Recognition

9 November 2018

Papers citing "Multimodal Grounding for Sequence-to-Sequence Speech Recognition"

7 / 7 papers shown

Title
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR Paul Hongsuck Seo Arsha Nagrani Cordelia Schmid 29 15 0 29 Mar 2023
Multimodal Speech Recognition for Language-Guided Embodied Agents Allen Chang Xiaoyuan Zhu Aarav Monga Seoho Ahn Tejas Srinivasan Jesse Thomason AuLLM 24 3 0 27 Feb 2023
Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations Dan Oneaţă H. Cucu 19 19 0 27 Apr 2022
Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey Ngoc Dung Huynh Mohamed Reda Bouadjenek Imran Razzak Kevin Lee Chetan Arora Ali Hassani A. Zaslavsky AAML 34 6 0 22 Feb 2022
Fine-Grained Grounding for Multimodal Speech Recognition Tejas Srinivasan Ramon Sanabria Florian Metze Desmond Elliott 23 11 0 05 Oct 2020
Multiresolution and Multimodal Speech Recognition with Transformers Georgios Paraskevopoulos Srinivas Parthasarathy Aparna Khare Shiva Sundaram 25 29 0 29 Apr 2020
Lip Reading Sentences in the Wild Joon Son Chung A. Senior Oriol Vinyals Andrew Zisserman 185 784 0 16 Nov 2016