ResearchTrend.AI
YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding

10 October 2022
Kayode Olaleye
Dan Oneaţă
Herman Kamper

Papers citing "YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding"

13 / 13 papers shown
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus
Josh Meyer
David Ifeoluwa Adelani
Edresson Casanova
A. Oktem
Daniel Whitenack
Julian Weber
...
Victor Akinode
Bernard Opoku
S. Olanrewaju
Jesujoba Oluwadara Alabi
Shamsuddeen Hassan Muhammad
36
23
0
07 Jul 2022
Building African Voices
Perez Ogayo
Graham Neubig
A. Black
120
15
0
01 Jul 2022
Keyword localisation in untranscribed speech using visually grounded speech models
Kayode Olaleye
Dan Oneaţă
Herman Kamper
51
7
0
02 Feb 2022
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech
David Harwath
Wei-Ning Hsu
James R. Glass
75
84
0
21 Nov 2019
On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval
Ankita Pasad
Bowen Shi
Herman Kamper
Karen Livescu
36
12
0
24 Apr 2019
End-to-End Automatic Speech Translation of Audiobooks
Alexandre Berard
Laurent Besacier
A. Kocabiyikoglu
Olivier Pietquin
112
193
0
12 Feb 2018
Semantic speech retrieval with a visually grounded model of untranscribed speech
Herman Kamper
Gregory Shakhnarovich
Karen Livescu
67
53
0
05 Oct 2017
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
Ron J. Weiss
J. Chorowski
Navdeep Jaitly
Yonghui Wu
Zhifeng Chen
79
344
0
24 Mar 2017
Visually grounded learning of keyword prediction from untranscribed speech
Herman Kamper
Shane Settle
Gregory Shakhnarovich
Karen Livescu
114
63
0
23 Mar 2017
Representations of language in a model of visually grounded speech signal
Grzegorz Chrupała
Lieke Gelderloos
Afra Alishahi
75
131
0
07 Feb 2017
Learning Word-Like Units from Joint Audio-Visual Analysis
David Harwath
James R. Glass
68
106
0
25 Jan 2017
Deep Multimodal Semantic Embeddings for Speech and Images
David Harwath
James R. Glass
62
157
0
11 Nov 2015
Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models
Bryan A. Plummer
Liwei Wang
Christopher M. Cervantes
Juan C. Caicedo
Julia Hockenmaier
Svetlana Lazebnik
202
2,071
0
19 May 2015