MAiVAR-T: Multimodal Audio-image and Video Action Recognizer using Transformers

Papers citing "MAiVAR-T: Multimodal Audio-image and Video Action Recognizer using Transformers"

17 / 17 papers shown
Audio-Visual Event Localization in Unconstrained Videos
Yapeng Tian
Jing Shi
Bochen Li
Zhiyao Duan
Chenliang Xu
101
439
0
23 Mar 2018
