ResearchTrend.AI
ActionFormer: Localizing Moments of Actions with Transformers

16 February 2022
Chenlin Zhang
Jianxin Wu
Yin Li
    ViT
ArXiv (abs) | PDF | HTML | GitHub (493★)

Papers citing "ActionFormer: Localizing Moments of Actions with Transformers"

50 / 205 papers shown
Action Dubber: Timing Audible Actions via Inflectional Flow
Wenlong Wan
Weiying Zheng
Tianyi Xiang
Guiqing Li
Shengfeng He
27
0
0
16 Jun 2025
Improving Keystep Recognition in Ego-Video via Dexterous Focus
Zachary Chavis
Stephen J. Guy
Hyun Soo Park
38
0
0
01 Jun 2025
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
Vasilii Korolkov
18
0
0
31 May 2025
Detecting Informative Channels: ActionFormer
Kunpeng Zhao
Asahi Miyazaki
Tsuyoshi Okita
17
0
0
27 May 2025
DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
Zijia Lu
A S M Iftekhar
Gaurav Mittal
Tianjian Meng
Xiawei Wang
Cheng Zhao
Rohith Kukkala
Ehsan Elhamifar
Mei Chen
77
0
0
22 May 2025
Generative AI for Autonomous Driving: A Review
Katharina Winter
Abhishek Vivekanandan
Rupert Polley
Yinzhe Shen
Christian Schlauch
...
Christian Wirth
Omer Sahin Tas
Nadja Klein
Fabian B. Flohr
Hanno Gottschalk
94
0
0
21 May 2025
HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
Simone Alberto Peirone
Francesca Pistilli
Giuseppe Averta
70
0
0
19 May 2025
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation
Edoardo Bianchi
Antonio Liotta
59
0
0
13 May 2025
Object-Shot Enhanced Grounding Network for Egocentric Video
Yisen Feng
Haoyu Zhang
Meng Liu
Weili Guan
Liqiang Nie
80
3
0
07 May 2025
Reducing Annotation Burden in Physical Activity Research Using Vision-Language Models
Abram Schonfeldt
Benjamin Maylor
Xiaofang Chen
Ronald Clark
Aiden Doherty
133
0
0
06 May 2025
Empowering Agentic Video Analytics Systems with Video Language Models
Yuxuan Yan
Shiqi Jiang
Ting Cao
Yifan Yang
Qianqian Yang
Yuanchao Shu
Yue Yang
Lili Qiu
VLM
145
0
0
01 May 2025
Multi-Stage Boundary-Aware Transformer Network for Action Segmentation in Untrimmed Surgical Videos
Rezowan Shuvo
M S Mekala
Eyad Elyan
MedIm
425
0
0
26 Apr 2025
Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer
Ziyi Liu
Yang Liu
72
1
0
21 Apr 2025
Grounding-MD: Grounded Video-language Pre-training for Open-World Moment Detection
Weijun Zhuang
Qizhang Li
Xin Li
Ming-Yu Liu
Xiaopeng Hong
Feng Gao
Fan Yang
W. Zuo
79
0
0
20 Apr 2025
Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization
Hongwei Ji
Wulian Yun
Mengshi Qi
Huadong Ma
LRM
445
0
0
18 Apr 2025
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
Chen Wang
Fei Xia
Wenhao Yu
Tingnan Zhang
Ruohan Zhang
Ce Liu
Li Fei-Fei
Jie Tan
Jacky Liang
80
1
0
17 Apr 2025
F$^3$Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhe Hou
Yun Lin
Jin Song Dong
73
0
0
11 Apr 2025
SEVERE++: Evaluating Benchmark Sensitivity in Generalization of Video Representation Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Piyush Bagad
Hazel Doughty
Bernard Ghanem
Cees G. M. Snoek
ViTSSL
118
0
0
08 Apr 2025
SocialGesture: Delving into Multi-person Gesture Understanding
Xu Cao
Pranav Virupaksha
Wenqi Jia
Bolin Lai
Fiona Ryan
Sangmin Lee
James M. Rehg
SLR
91
0
0
03 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
140
0
0
01 Apr 2025
FDDet: Frequency-Decoupling for Boundary Refinement in Temporal Action Detection
Xinnan Zhu
Yicheng Zhu
Tixin Chen
Wentao Wu
Yuanjie Dang
116
0
0
01 Apr 2025
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura
Antoine Yang
Cordelia Schmid
Gül Varol
90
0
0
31 Mar 2025
Towards Precise Action Spotting: Addressing Temporal Misalignment in Labels with Dynamic Label Assignment
Masato Tamura
84
0
0
31 Mar 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
103
0
0
28 Mar 2025
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang
Fadime Sener
Angela Yao
OffRL
125
2
0
24 Mar 2025
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen
Yong Guo
Jiaming Liang
Sitong Zhuang
Runhao Zeng
Xiping Hu
90
0
0
21 Mar 2025
TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos
Chen-Lin Zhang
Lin Sui
Shuming Liu
Fangzhou Mu
Ziyi Wang
Bernard Ghanem
110
2
0
09 Mar 2025
SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic
Yue Yang
Wei Wang
Yifei Liu
Linfeng Dong
Hao Wu
Mingxin Zhang
Zhihang Zhong
Xiao-Fu Sun
82
1
0
09 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
87
0
0
08 Mar 2025
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
Shuming Liu
Chen Zhao
Fatimah Zohra
Mattia Soldan
Alejandro Pardo
...
Juan Carlos León Alcázar
A. Cioppa
Silvio Giancola
Carlos Hinojosa
Bernard Ghanem
110
3
0
27 Feb 2025
SurgPLAN++: Universal Surgical Phase Localization Network for Online and Offline Inference
Zhen Chen
Xingjian Luo
Jinlin Wu
Long Bai
Zhen Lei
Hongliang Ren
Sebastien Ourselin
Hongbin Liu
153
1
0
17 Feb 2025
Do Language Models Understand Time?
Xi Ding
Lei Wang
325
2
0
18 Dec 2024
Training Strategies for Isolated Sign Language Recognition
Karina Kvanchiani
Roman Kraynov
Elizaveta Petrova
Petr Surovcev
Aleksandr Nagaev
A. Kapitanov
161
1
0
16 Dec 2024
Speech-Forensics: Towards Comprehensive Synthetic Speech Dataset Establishment and Analysis
Zhoulin Ji
Chenhao Lin
Hang Wang
Chao Shen
170
1
0
12 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
191
2
0
12 Dec 2024
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark
Joseph Heyward
João Carreira
Dima Damen
Andrew Zisserman
Viorica Patraucean
135
2
0
29 Nov 2024
DiMoDif: Discourse Modality-information Differentiation for Audio-visual Deepfake Detection and Localization
C. Koutlis
Symeon Papadopoulos
124
4
0
15 Nov 2024
PESFormer: Boosting Macro- and Micro-expression Spotting with Direct Timestamp Encoding
Wang-Wang Yu
Kai-Fu Yang
Xiangrui Hu
Jingwen Jiang
Hong-Mei Yan
Yong-Jie Li
62
0
0
24 Oct 2024
ContextDet: Temporal Action Detection with Adaptive Context Aggregation
Ning Wang
Yun Xiao
Xiaopeng Peng
Xiaojun Chang
Xuanhong Wang
Dingyi Fang
102
2
0
20 Oct 2024
Zero-shot Action Localization via the Confidence of Large Vision-Language Models
Josiah Aklilu
Xiaohan Wang
Serena Yeung-Levy
122
1
0
18 Oct 2024
The Solution for Temporal Action Localisation Task of Perception Test Challenge 2024
Yinan Han
Qingyuan Jiang
Hongming Mei
Yang Yang
Jinhui Tang
88
0
0
08 Oct 2024
EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos Referring to Procedural Texts
Yuto Haneji
Taichi Nishimura
Hirotaka Kameko
Keisuke Shirai
Tomoya Yoshida
Keiya Kajimura
Koki Yamamoto
Taiyu Cui
Tomohiro Nishimoto
Shinsuke Mori
EgoV
84
0
0
07 Oct 2024
Solution for Temporal Sound Localisation Task of ECCV Second Perception Test Challenge 2024
Haowei Gu
Weihao Zhu
Yang Yang
78
0
0
29 Sep 2024
Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks
Min Yang
Zichen Zhang
Limin Wang
AI4TS
71
0
0
27 Sep 2024
AMEGO: Active Memory from long EGOcentric videos
Gabriele Goletto
Tushar Nagarajan
Giuseppe Averta
Dima Damen
EgoV
89
7
0
17 Sep 2024
HAVANA: Hierarchical stochastic neighbor embedding for Accelerated Video ANnotAtions
Alexandru Bobe
Jan van Gemert
36
0
0
16 Sep 2024
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Ling Xing
Hongyu Qu
Rui Yan
Xiangbo Shu
Jinhui Tang
161
2
0
12 Sep 2024
Introducing Gating and Context into Temporal Action Detection
Aglind Reka
Diana Laura Borza
Dominick Reilly
Michal Balazia
Francois Bremond
78
0
0
06 Sep 2024
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
99
0
0
06 Sep 2024
HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics
Gueter Josmy Faure
Jia-Fong Yeh
Min-Hung Chen
Hung-Ting Su
S. Lai
Winston H. Hsu
85
3
0
30 Aug 2024