Learning To Recognize Procedural Activities with Distant Supervision

Learning To Recognize Procedural Activities with Distant Supervision

26 January 2022

Gedas Bertasius

Marcus Rohrbach

Lorenzo Torresani

Papers citing "Learning To Recognize Procedural Activities with Distant Supervision"

18 / 68 papers shown

Title
Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering Hung-Ting Su Yulei Niu Xudong Lin Winston H. Hsu Shih-Fu Chang VGen ELM 21 6 0 07 Apr 2023
Procedure-Aware Pretraining for Instructional Video Understanding Honglu Zhou Roberto Martín-Martín Mubbasir Kapadia Silvio Savarese Juan Carlos Niebles 25 38 0 31 Mar 2023
Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations Yiwu Zhong Licheng Yu Yang Bai Shangwen Li Xueting Yan Yin Li AI4TS 32 31 0 31 Mar 2023
Selective Structured State-Spaces for Long-Form Video Understanding Jue Wang Wenjie Zhu Pichao Wang Xiang Yu Linda Liu Mohamed Omar Raffay Hamid 41 94 0 25 Mar 2023
Learning and Verification of Task Structure in Instructional Videos Medhini Narasimhan Licheng Yu Sean Bell Ning Zhang Trevor Darrell 68 19 0 23 Mar 2023
Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos Sixun Dong Huazhang Hu Dongze Lian Weixin Luo Yichen Qian Shenghua Gao ViT AI4TS 23 11 0 22 Mar 2023
Efficient Movie Scene Detection using State-Space Transformers Md. Mohaiminul Islam Mahmudul Hasan Kishan Athrey Tony Braskich Gedas Bertasius ViT 36 44 0 29 Dec 2022
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning Yuchong Sun Hongwei Xue Ruihua Song Bei Liu Huan Yang Jianlong Fu AI4TS VLM 18 68 0 12 Oct 2022
Turbo Training with Token Dropout Tengda Han Weidi Xie Andrew Zisserman ViT 26 10 0 10 Oct 2022
Learning to Decompose Visual Features with Latent Textual Prompts Feng Wang Manling Li Xudong Lin Hairong Lv A. Schwing Heng Ji VLM 19 23 0 09 Oct 2022
EgoTaskQA: Understanding Human Tasks in Egocentric Videos Baoxiong Jia Ting Lei Song-Chun Zhu Siyuan Huang EgoV 30 61 0 08 Oct 2022
A Closer Look at Temporal Ordering in the Segmentation of Instructional Videos Anil Batra Shreyank N. Gowda Frank Keller Laura Sevilla-Lara 28 5 0 30 Sep 2022
Multimedia Generative Script Learning for Task Planning Qingyun Wang Manling Li Hou Pong Chan Lifu Huang J. Hockenmaier Girish Chowdhary Heng Ji VGen 29 10 0 25 Aug 2022
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval Xudong Lin Simran Tiwari Shiyuan Huang Manling Li Mike Zheng Shou Heng Ji Shih-Fu Chang 25 20 0 05 Jun 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners Zhenhailong Wang Manling Li Ruochen Xu Luowei Zhou Jie Lei ... Chenguang Zhu Derek Hoiem Shih-Fu Chang Joey Tianyi Zhou Heng Ji MLLM VLM 170 137 0 22 May 2022
Long Movie Clip Classification with State-Space Video Models Md. Mohaiminul Islam Gedas Bertasius VLM 40 102 0 04 Apr 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding Hu Xu Gargi Ghosh Po-Yao (Bernie) Huang Dmytro Okhonko Armen Aghajanyan Florian Metze Luke Zettlemoyer Florian Metze Luke Zettlemoyer Christoph Feichtenhofer CLIP VLM 259 558 0 28 Sep 2021
Is Space-Time Attention All You Need for Video Understanding? Gedas Bertasius Heng Wang Lorenzo Torresani ViT 280 1,982 0 09 Feb 2021