arXiv: 2403.14402
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

21 March 2024
HyoJung Han
Mohamed Anwar
J. Pino
Wei-Ning Hsu
Marine Carpuat
Bowen Shi
Changhan Wang
    VLM

Papers citing "XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception"

9 / 9 papers shown
CoGenAV: Versatile Audio-Visual Representation Learning via Contrastive-Generative Synchronization
Detao Bai
Zhiheng Ma
Xihan Wei
Liefeng Bo
120
0
0
06 May 2025
mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition
Andrew Rouditchenko
Saurabhchand Bhati
Samuel Thomas
Hilde Kuehne
Rogerio Feris
116
1
0
03 Feb 2025
SpeechQE: Estimating the Quality of Direct Speech Translation
HyoJung Han
Kevin Duh
Marine Carpuat
34
0
0
28 Oct 2024
Tailored Design of Audio-Visual Speech Recognition Models using Branchformers
David Gimeno-Gómez
Carlos David Martínez Hinarejos
91
2
0
09 Jul 2024
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Qiu-shi Zhu
Long Zhou
Jie Zhang
Shujie Liu
Yu-Chen Hu
Lirong Dai
VLM
SSL
60
37
0
27 Oct 2022
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech
Alexis Conneau
Min Ma
Simran Khanuja
Yu Zhang
Vera Axelrod
Siddharth Dalmia
Jason Riesa
Clara E. Rivera
Ankur Bapna
VLM
83
282
0
25 May 2022
End-to-end Audio-visual Speech Recognition with Conformers
Pingchuan Ma
Stavros Petridis
M. Pantic
84
225
0
12 Feb 2021
Lipreading using Temporal Convolutional Networks
Brais Martínez
Pingchuan Ma
Stavros Petridis
M. Pantic
168
239
0
23 Jan 2020
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
164
784
0
16 Nov 2016