Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2202.07925
Cited By
ActionFormer: Localizing Moments of Actions with Transformers
16 February 2022
Chen-Da Liu-Zhang
Jianxin Wu
Yin Li
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ActionFormer: Localizing Moments of Actions with Transformers"
50 / 68 papers shown
Title
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation
Edoardo Bianchi
Antonio Liotta
29
0
0
13 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
41
0
0
07 May 2025
Reducing Annotation Burden in Physical Activity Research Using Vision-Language Models
Abram Schonfeldt
Benjamin Maylor
Xiaofang Chen
Ronald Clark
Aiden Doherty
68
0
0
06 May 2025
Empowering Agentic Video Analytics Systems with Video Language Models
Yuxuan Yan
Shiqi Jiang
Ting Cao
Xuming Hu
Qianqian Yang
Yuanchao Shu
Y. Yang
Lili Qiu
VLM
70
0
0
01 May 2025
Multi-Stage Boundary-Aware Transformer Network for Action Segmentation in Untrimmed Surgical Videos
Rezowan Shuvo
M S Mekala
Eyad Elyan
MedIm
131
0
0
26 Apr 2025
Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization
Hongwei Ji
Wulian Yun
Mengshi Qi
Huadong Ma
LRM
159
0
0
18 Apr 2025
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
Chen Wang
Fei Xia
Wenhao Yu
Tingnan Zhang
Ruohan Zhang
Ce Liu
Li Fei-Fei
Jie Tan
Jacky Liang
33
0
0
17 Apr 2025
F
3
^3
3
Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhé Hóu
Yun Lin
J. Dong
37
0
0
11 Apr 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
47
0
0
28 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
39
0
0
08 Mar 2025
SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference
Zhen Chen
Xingjian Luo
Jinlin Wu
Long Bai
Zhen Lei
Hongliang Ren
Sebastien Ourselin
Hongbin Liu
61
0
0
17 Feb 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
178
0
0
18 Dec 2024
Training Strategies for Isolated Sign Language Recognition
Karina Kvanchiani
Roman Kraynov
Elizaveta Petrova
Petr Surovcev
Aleksandr Nagaev
A. Kapitanov
76
1
0
16 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
109
1
0
12 Dec 2024
DiMoDif: Discourse Modality-information Differentiation for Audio-visual Deepfake Detection and Localization
C. Koutlis
Symeon Papadopoulos
58
2
0
15 Nov 2024
Zero-shot Action Localization via the Confidence of Large Vision-Language Models
Josiah Aklilu
Xiaohan Wang
Serena Yeung-Levy
57
1
0
18 Oct 2024
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Ling Xing
Hongyu Qu
Rui Yan
Xiangbo Shu
Jinhui Tang
45
1
0
12 Sep 2024
Introducing Gating and Context into Temporal Action Detection
Aglind Reka
Diana Laura Borza
Dominick Reilly
Michal Balazia
Francois Bremond
23
0
0
06 Sep 2024
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
28
0
0
06 Sep 2024
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li
Pengyu Liu
Pengyu Liu
Guoliang Chen
Zhiliang Wu
Hehe Fan
Meng Wang
42
7
0
07 Jul 2024
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
Akshita Gupta
Aditya Arora
Sanath Narayan
Salman Khan
F. Khan
Graham W. Taylor
38
3
0
21 Jun 2024
MALT: Multi-scale Action Learning Transformer for Online Action Detection
Zhipeng Yang
Ruoyu Wang
Yang Tan
Liping Xie
OffRL
43
1
0
31 May 2024
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions
Runhao Zeng
Xiaoyong Chen
Jiaming Liang
Huisi Wu
Guangzhong Cao
Yong Guo
AAML
39
3
0
29 Mar 2024
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
42
1
0
27 Mar 2024
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
31
0
0
20 Dec 2023
Enhancing Single-Frame Supervision for Better Temporal Action Localization
Changjian Chen
Jiashu Chen
Weikai Yang
Haoze Wang
Johannes Knittel
Xibin Zhao
Steffen Koch
Thomas Ertl
Shixia Liu
26
3
0
08 Dec 2023
Low-power, Continuous Remote Behavioral Localization with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
24
7
0
06 Dec 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Da Liu-Zhang
Chen Zhao
Bernard Ghanem
33
25
0
28 Nov 2023
Boundary Discretization and Reliable Classification Network for Temporal Action Detection
Zhenying Fang
Jun Yu
Richang Hong
26
0
0
10 Oct 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
31
53
0
21 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
39
30
0
21 Aug 2023
NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023
Lin Sui
Fangzhou Mu
Yin Li
22
3
0
05 Jul 2023
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
53
4
0
25 May 2023
Decomposed Cross-modal Distillation for RGB-based Temporal Action Detection
Pilhyeon Lee
Taeoh Kim
Minho Shim
Dongyoon Wee
H. Byun
30
11
0
30 Mar 2023
DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion
Sauradip Nag
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
DiffM
VGen
41
21
0
27 Mar 2023
Multi-modal Prompting for Low-Shot Temporal Action Localization
Chen Ju
Zeqian Li
Peisen Zhao
Ya-Qin Zhang
Xiaopeng Zhang
Qi Tian
Yanfeng Wang
Weidi Xie
36
18
0
21 Mar 2023
TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization
Tuan N. Tang
Kwonyoung Kim
K. Sohn
18
29
0
16 Mar 2023
Co-Occurrence Matters: Learning Action Relation for Temporal Action Localization
Congqi Cao
Yizhe Wang
Yuelie Lu
X. Zhang
Yanning Zhang
33
4
0
15 Mar 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
36
220
0
27 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
23
7
0
16 Feb 2023
Epic-Sounds: A Large-scale Dataset of Actions That Sound
Jaesung Huh
Jacob Chalk
Evangelos Kazakos
Dima Damen
Andrew Zisserman
EgoV
18
41
0
01 Feb 2023
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
VLM
AI4TS
22
51
0
05 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
72
23
0
03 Jan 2023
Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions
Pratik K. Mishra
Alex Mihailidis
Shehroz S. Khan
28
17
0
31 Dec 2022
InternVideo: General Video Foundation Models via Generative and Discriminative Learning
Yi Wang
Kunchang Li
Yizhuo Li
Yinan He
Bingkun Huang
...
Junting Pan
Jiashuo Yu
Yali Wang
Limin Wang
Yu Qiao
VLM
VGen
52
309
0
06 Dec 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao
Shuming Liu
K. Mangalam
Bernard Ghanem
33
17
0
25 Nov 2022
Data Augmentation Vision Transformer for Fine-grained Image Classification
Chao Hu
Liqiang Zhu
Weibin Qiu
Weijie Wu
ViT
38
3
0
23 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
106
0
17 Nov 2022
Where a Strong Backbone Meets Strong Features -- ActionFormer for Ego4D Moment Queries Challenge
Fangzhou Mu
Sicheng Mo
Gillian Wang
Yin Li
27
3
0
16 Nov 2022
1
2
Next