Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models

5 July 2021
Khazar Khorrami
Okko Rasanen
arXiv: 2108.02562

Papers citing "Evaluation of Audio-Visual Alignments in Visually Grounded Speech Models"

7 / 7 papers shown
A model of early word acquisition based on realistic-scale audiovisual naming events
Khazar Khorrami
Okko Rasanen
NAI
42
0
0
07 Jun 2024
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
Pavel Denisov
Ngoc Thang Vu
29
1
0
09 Oct 2023
Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
Puyuan Peng
Shang-Wen Li
Okko Rasanen
Abdel-rahman Mohamed
David F. Harwath
SSL
VLM
26
7
0
19 May 2023
Multilingual and Multimodal Topic Modelling with Pretrained Embeddings
Elaine Zosa
Lidia Pivovarova
BDL
11
8
0
15 Nov 2022
Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation
Khazar Khorrami
Okko Rasanen
34
20
0
29 Sep 2021
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
239
31,257
0
16 Jan 2013