Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model

Papers citing "Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model"

39 / 39 papers shown
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang
81
30
0
22 Jun 2024
The Sound of Motions
Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba
80
254
0
11 Apr 2019
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu
99
435
0
23 Mar 2018