A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval

29 February 2024

Andreea-Maria Oncescu

João F. Henriques

Andrew Zisserman

A. Sophia Koepke

Papers citing "A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval"

Title | Authors | Date
Audio-Language Datasets of Scenes and Events: A Survey | Gijs Wijngaard, Elia Formisano, Michele Esposito, M. Dumontier | 10 Jan 2025
Epic-Sounds: A Large-scale Dataset of Actions That Sound | Jaesung Huh, Jacob Chalk, Evangelos Kazakos, Dima Damen, Andrew Zisserman | 01 Feb 2023
Learning State-Aware Visual Representations from Audible Interactions | Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta | 27 Sep 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov | 02 Feb 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video | Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, ..., Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik | 13 Oct 2021