Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling

7 February 2022
Puyuan Peng
David Harwath
    SSL
arXiv:2202.03543

Papers citing "Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling"

13 / 13 papers shown
A model of early word acquisition based on realistic-scale audiovisual naming events
Khazar Khorrami
Okko Rasanen
NAI
45
0
0
07 Jun 2024
A Large-Scale Evaluation of Speech Foundation Models
Shu-Wen Yang
Heng-Jui Chang
Zili Huang
Andy T. Liu
Cheng-I Jeff Lai
...
Kushal Lakhotia
Shang-Wen Li
Abdelrahman Mohamed
Shinji Watanabe
Hung-yi Lee
38
19
0
15 Apr 2024
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System
Khazar Khorrami
María Andrea Cruz Blandón
Tuomas Virtanen
Okko Rasanen
SSL
27
1
0
05 Jun 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David Harwath
SSL
VLM
26
7
0
19 May 2023
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
33
109
0
08 Nov 2022
Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge
Ewan Dunbar
Nicolas Hamilakis
Emmanuel Dupoux
SSL
32
30
0
27 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
28
33
0
16 Oct 2022
AudioLM: a Language Modeling Approach to Audio Generation
Zalan Borsos
Raphaël Marinier
Damien Vincent
Eugene Kharitonov
Olivier Pietquin
...
Dominik Roblek
O. Teboul
David Grangier
Marco Tagliasacchi
Neil Zeghidour
AuLLM
49
567
0
07 Sep 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
Afra Alishahi
Grzegorz Chrupała
Alejandrina Cristià
Emmanuel Dupoux
Bertrand Higy
Marvin Lavechin
Okko Rasanen
Chen Yu
41
7
0
14 Jul 2021
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021
Takashi Maekaku
Xuankai Chang
Yuya Fujita
Li-Wei Chen
Shinji Watanabe
Alexander I. Rudnicky
109
13
0
13 Jul 2021
Exploring wav2vec 2.0 on speaker verification and language identification
Zhiyun Fan
Meng Li
Shiyu Zhou
Bo Xu
117
202
0
11 Dec 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
189
288
0
25 Jan 2020