Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog

20 December 2019

Papers citing "Leveraging Topics and Audio Features with Multimodal Attention for Audio Visual Scene-Aware Dialog"

Title
STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering · Yueqian Wang, Yuxuan Wang, Kai Chen, Dongyan Zhao · 33 2 0 · 08 Jan 2024
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks · Hung Le, Nancy F. Chen, Guosheng Lin · MLLM · 26 19 0 · 16 Apr 2021
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues · Hung Le, Nancy F. Chen, Guosheng Lin · 36 14 0 · 01 Mar 2021