Fast-Slow Transformer for Visually Grounding Speech

16 September 2021
Puyuan Peng
David F. Harwath

Papers citing "Fast-Slow Transformer for Visually Grounding Speech"

9 / 9 papers shown
Measuring Sound Symbolism in Audio-visual Models
Wei-Cheng Tseng
Yi-Jen Shih
David Harwath
Raymond Mooney
34
0
0
18 Sep 2024
A model of early word acquisition based on realistic-scale audiovisual naming events
Khazar Khorrami
Okko Rasanen
NAI
42
0
0
07 Jun 2024
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System
Khazar Khorrami
María Andrea Cruz Blandón
Tuomas Virtanen
Okko Rasanen
SSL
20
1
0
05 Jun 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David F. Harwath
SSL
VLM
26
7
0
19 May 2023
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
24
109
0
08 Nov 2022
Towards visually prompted keyword localisation for zero-resource spoken languages
Leanne Nortje
Herman Kamper
16
6
0
12 Oct 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
19
7
0
02 Feb 2022
ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
Afra Alishahi
Grzegorz Chrupała
Alejandrina Cristià
Emmanuel Dupoux
Bertrand Higy
Marvin Lavechin
Okko Rasanen
Chen Yu
41
7
0
14 Jul 2021