2109.08186
Fast-Slow Transformer for Visually Grounding Speech
16 September 2021
Puyuan Peng
David F. Harwath
Papers citing "Fast-Slow Transformer for Visually Grounding Speech"
9 / 9 papers shown
Measuring Sound Symbolism in Audio-visual Models
Wei-Cheng Tseng
Yi-Jen Shih
David Harwath
Raymond Mooney
34
0
0
18 Sep 2024
A model of early word acquisition based on realistic-scale audiovisual naming events
Khazar Khorrami
Okko Rasanen
NAI
42
0
0
07 Jun 2024
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System
Khazar Khorrami
María Andrea Cruz Blandón
Tuomas Virtanen
Okko Rasanen
SSL
20
1
0
05 Jun 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David F. Harwath
SSL
VLM
26
7
0
19 May 2023
Comparative layer-wise analysis of self-supervised speech models
Ankita Pasad
Bowen Shi
Karen Livescu
SSL
24
109
0
08 Nov 2022
Towards visually prompted keyword localisation for zero-resource spoken languages
Leanne Nortje
Herman Kamper
16
6
0
12 Oct 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
19
7
0
02 Feb 2022
ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
Afra Alishahi
Grzegorz Chrupała
Alejandrina Cristià
Emmanuel Dupoux
Bertrand Higy
Marvin Lavechin
Okko Rasanen
Chen Yu
41
7
0
14 Jul 2021