Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events

23 February 2021

Papers citing "Text-to-Audio Grounding: Building Correspondence Between Captions and Sound Events"

Title
FLAM: Frame-Wise Language-Audio Modeling | Yusong Wu, Christos Tsirigotis, Ke Chen, Cheng-Zhi Anna Huang, Rameswar Panda, Oriol Nieto, Prem Seetharaman, Justin Salamon | 50 0 0 | 08 May 2025
Audio-Language Datasets of Scenes and Events: A Survey | Gijs Wijngaard, Elia Formisano, Michele Esposito, M. Dumontier | 81 2 0 | 10 Jan 2025
Large Language Models are Few-Shot Health Learners | Xin Liu, Daniel J. McDuff, G. Kovács, I. Galatzer-Levy, Jacob Sunshine, Jiening Zhan, M. Poh, Shun Liao, P. Achille, Shwetak N. Patel | LM&MA, AI4MH | 39 103 0 | 24 May 2023
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss | Andrew Koh, Chng Eng Siong | 26 1 0 | 29 Jun 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning | Xuenan Xu, Zeyu Xie, Mengyue Wu, K. Yu | 34 13 0 | 11 May 2022
A Framework for the Robust Evaluation of Sound Event Detection | Cagdas Bilen, Giacomo Ferroni, Francesco Tuveri, Juan Azcarreta, Sacha Krstulović | 43 163 0 | 18 Oct 2019