SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set

26 July 2017
William N. Havard
Laurent Besacier
O. Rosec
arXiv: 1707.08435

Papers citing "SPEECH-COCO: 600k Visually Grounded Spoken Captions Aligned to MSCOCO Data Set"

3 / 3 papers shown
Fine-Grained Grounding for Multimodal Speech Recognition
Tejas Srinivasan
Ramon Sanabria
Florian Metze
Desmond Elliott
23
11
0
05 Oct 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
22
141
0
16 Jun 2020
Captioning Images Taken by People Who Are Blind
Danna Gurari
Yinan Zhao
Meng Zhang
Nilavra Bhattacharya
22
181
0
20 Feb 2020