LocoMotion: Learning Motion-Focused Video-Language Representations

15 October 2024

Fida Mohammad Thoker

Cees G. M. Snoek

Papers citing "LocoMotion: Learning Motion-Focused Video-Language Representations"

8 / 58 papers shown

Title | Authors | Tag | Metrics | Date
Fine-grained Activity Recognition in Baseball Videos | A. Piergiovanni, Michael S. Ryoo | - | 72 75 0 | 09 Apr 2018
Localizing Moments in Video with Natural Language | Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan C. Russell | - | 133 949 0 | 04 Aug 2017
The "something something" video database for learning and evaluating visual common sense | Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzynska, S. Westphal, ..., Moritz Mueller-Freitag, F. Hoppe, Christian Thurau, Ingo Bax, Roland Memisevic | VLM | 108 1,542 0 | 13 Jun 2017
Dense-Captioning Events in Videos | Ranjay Krishna, Kenji Hata, F. Ren, Li Fei-Fei, Juan Carlos Niebles | - | 152 1,251 0 | 02 May 2017
TGIF-QA: Toward Spatio-Temporal Reasoning in Visual Question Answering | Y. Jang, Yale Song, Youngjae Yu, Youngjin Kim, Gunhee Kim | - | 89 562 0 | 14 Apr 2017
Convolutional Two-Stream Network Fusion for Video Action Recognition | Christoph Feichtenhofer, A. Pinz, Andrew Zisserman | - | 173 2,612 0 | 22 Apr 2016
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding | Gunnar Sigurdsson, Gül Varol, Xinyu Wang, Ali Farhadi, Ivan Laptev, Abhinav Gupta | VGen | 117 1,248 0 | 06 Apr 2016
Two-Stream Convolutional Networks for Action Recognition in Videos | Karen Simonyan, Andrew Zisserman | - | 264 7,545 0 | 09 Jun 2014