ActionFormer: Localizing Moments of Actions with Transformers

16 February 2022
Chenlin Zhang
Jianxin Wu
Yin Li
    ViT
ArXiv (abs) | PDF | HTML | GitHub (493★)

Papers citing "ActionFormer: Localizing Moments of Actions with Transformers"

50 / 205 papers shown
FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation
Takuma Yagi
Misaki Ohashi
Yifei Huang
Ryosuke Furuta
Shungo Adachi
Toutai Mitsuyama
Yoichi Sato
72
6
0
01 Feb 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
Florentin Wörgötter
Alexander S. Ecker
129
6
0
29 Jan 2024
Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews)
Shih-Han Chou
Matthew Kowal
Yasmin Niknam
Diana Moyano
Shayaan Mehdi
...
Cheng Zhang
Ian Knopke
S. Kocak
Leonid Sigal
Yalda Mohsenzadeh
137
1
0
23 Jan 2024
Detours for Navigating Instructional Videos
Kumar Ashutosh
Zihui Xue
Tushar Nagarajan
Kristen Grauman
127
6
0
03 Jan 2024
Retrieval-Augmented Egocentric Video Captioning
Jilan Xu
Yifei Huang
Junlin Hou
Guo Chen
Yue Zhang
Rui Feng
Weidi Xie
EgoV
174
37
0
01 Jan 2024
CaptainCook4D: A dataset for understanding errors in procedural activities
Rohith Peddi
Shivvrat Arya
B. Challa
Likhitha Pallapothula
Akshay Vyas
...
Vasundhara Komaragiri
Eric D. Ragan
Nicholas Ruozzi
Yu Xiang
Vibhav Gogate
102
14
0
22 Dec 2023
Perception Test 2023: A Summary of the First Challenge And Outcome
Joseph Heyward
João Carreira
Dima Damen
Andrew Zisserman
Viorica Patraucean
63
0
0
20 Dec 2023
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
169
0
0
20 Dec 2023
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar
Yongqin Xian
A. Tonioni
Andrew Zisserman
Federico Tombari
108
12
0
19 Dec 2023
Grounded Question-Answering in Long Egocentric Videos
Shangzhe Di
Weidi Xie
134
27
0
11 Dec 2023
Enhancing Single-Frame Supervision for Better Temporal Action Localization
Changjian Chen
Jiashu Chen
Weikai Yang
Haoze Wang
Johannes Knittel
Xibin Zhao
Steffen Koch
Thomas Ertl
Shixia Liu
71
4
0
08 Dec 2023
MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding
Hongjie Zhang
Yi Liu
Lu Dong
Yifei Huang
Z. Ling
Yali Wang
Limin Wang
Yu Qiao
93
31
0
08 Dec 2023
Low-power, Continuous Remote Behavioral Localization with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
67
8
0
06 Dec 2023
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang
Huan Gao
Ping Guo
Limin Wang
ViT
87
6
0
04 Dec 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Lin Zhang
Chen Zhao
Bernard Ghanem
119
29
0
28 Nov 2023
Centre Stage: Centricity-based Audio-Visual Temporal Action Detection
Hanyuan Wang
Majid Mirmehdi
Dima Damen
Toby Perrett
82
2
0
28 Nov 2023
ADM-Loc: Actionness Distribution Modeling for Point-supervised Temporal Action Localization
Elahe Vahdani
Yingli Tian
77
0
0
27 Nov 2023
Temporal Action Localization for Inertial-based Human Activity Recognition
Marius Bock
Michael Moeller
Kristof Van Laerhoven
65
0
0
27 Nov 2023
AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
Zhixi Cai
Shreya Ghosh
Aman Pankaj Adatia
Munawar Hayat
Abhinav Dhall
Kalin Stefanov
83
37
0
26 Nov 2023
SurgPLAN: Surgical Phase Localization Network for Phase Recognition
Xingjian Luo
You Pang
Zhen Chen
Jinlin Wu
Zongmin Zhang
Zhen Lei
Hongbin Liu
71
8
0
16 Nov 2023
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection
Thinh Phan
Khoa T. Vo
Duy Le
Gianfranco Doretto
Don Adjeroh
Ngan Le
VLM
91
12
0
01 Nov 2023
A Hybrid Graph Network for Complex Activity Detection in Video
Salman Khan
Izzeddin Teeti
Andrew Bradley
Mohamed Elhoseiny
Fabio Cuzzolin
85
2
0
26 Oct 2023
POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization
Elahe Vahdani
Yingli Tian
66
1
0
20 Oct 2023
NurViD: A Large Expert-Level Video Database for Nursing Procedure Activity Understanding
Ming Hu
Lin Wang
Siyuan Yan
Don Ma
Qingli Ren
Peng Xia
Wei Feng
Peibo Duan
Lie Ju
Zongyuan Ge
85
15
0
20 Oct 2023
Boundary Discretization and Reliable Classification Network for Temporal Action Detection
Zhenying Fang
Jun Yu
Richang Hong
71
0
0
10 Oct 2023
Multi-Resolution Audio-Visual Feature Fusion for Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
47
0
0
05 Oct 2023
ENIGMA-51: Towards a Fine-Grained Understanding of Human-Object Interactions in Industrial Scenarios
Francesco Ragusa
Rosario Leonardi
Michele Mazzamuto
Claudia Bonanno
Rosario Scavo
Antonino Furnari
G. Farinella
67
7
0
26 Sep 2023
VidChapters-7M: Video Chapters at Scale
Antoine Yang
Arsha Nagrani
Ivan Laptev
Josef Sivic
Cordelia Schmid
VGen
98
28
0
25 Sep 2023
Boundary-Aware Proposal Generation Method for Temporal Action Localization
Hao Zhang
Chunyan Feng
Jiahui Yang
Zheng Li
Caili Guo
48
0
0
25 Sep 2023
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding
Mohamed Afham
Satya Narayan Shukla
Omid Poursaeed
Pengchuan Zhang
Ashish Shah
Sernam Lim
VLM
56
2
0
20 Sep 2023
Temporal Action Localization with Enhanced Instant Discriminability
Ding Shi
Qiong Cao
Yujie Zhong
Shan An
Jian Cheng
Haogang Zhu
Dacheng Tao
89
9
0
11 Sep 2023
UMMAFormer: A Universal Multimodal-adaptive Transformer Framework for Temporal Forgery Localization
Rui Zhang
Hongxia Wang
Ming-han Du
Hanqing Liu
Yangqiaoyu Zhou
Q. Zeng
93
24
0
28 Aug 2023
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models
Jan Warchocki
Teodor Oprescu
Yunhan Wang
Alexandru Damacus
Paul Misterka
Robert-Jan Bruintjes
A. Lengyel
Ombretta Strafforello
Jan van Gemert
28
2
0
24 Aug 2023
HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation
Han Zhang
Xiang Wang
Xiaohao Xu
Zhiwu Qing
Changxin Gao
Nong Sang
75
14
0
24 Aug 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
136
55
0
21 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
73
36
0
21 Aug 2023
Self-Feedback DETR for Temporal Action Detection
Jihwan Kim
Miso Lee
Jae-Pil Heo
100
19
0
21 Aug 2023
Memory-and-Anticipation Transformer for Online Action Understanding
Jiahao Wang
Guo Chen
Yifei Huang
Limin Wang
Tong Lu
OffRL
137
43
0
15 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
78
10
0
10 Aug 2023
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection
Faegheh Sardari
A. Mustafa
Philip J. B. Jackson
A. Hilton
ViT
96
6
0
09 Aug 2023
VideoGLUE: Video General Understanding Evaluation of Foundation Models
Liangzhe Yuan
N. B. Gundavarapu
Long Zhao
Hao Zhou
Huayu Chen
...
Florian Schroff
Hartwig Adam
Ming-Hsuan Yang
Ting Liu
Boqing Gong
ELM
85
10
0
06 Jul 2023
NMS Threshold matters for Ego4D Moment Queries -- 2nd place solution to the Ego4D Moment Queries Challenge 2023
Lin Sui
Fangzhou Mu
Yin Li
54
3
0
05 Jul 2023
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023
Zhijian Hou
Lei Ji
Difei Gao
Wanjun Zhong
Kun Yan
Chong Li
W. Chan
Chong-Wah Ngo
Nan Duan
Mike Zheng Shou
97
17
0
27 Jun 2023
Action Sensitivity Learning for the Ego4D Episodic Memory Challenge 2023
Jiayi Shao
Xiaohan Wang
Ruijie Quan
Yezhou Yang
EgoV
58
8
0
15 Jun 2023
MammalNet: A Large-scale Video Benchmark for Mammal Recognition and Behavior Understanding
Jun Chen
Ming Hu
D. Coker
M. Berumen
Blair R. Costelloe
Sara Beery
Anna Rohrbach
Mohamed Elhoseiny
86
25
0
01 Jun 2023
A Multi-Modal Transformer Network for Action Detection
Matthew Korban
Scott T. Acton
Peter Youngs
ViT
56
15
0
31 May 2023
Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization
Huantao Ren
Wenfei Yang
Tianzhu Zhang
Yongdong Zhang
96
30
0
29 May 2023
Action Sensitivity Learning for Temporal Action Localization
Jiayi Shao
Xiaohan Wang
Ruijie Quan
Junjun Zheng
Jiang Yang
Yezhou Yang
126
24
0
25 May 2023
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
89
5
0
25 May 2023
Perception Test: A Diagnostic Benchmark for Multimodal Video Models
Viorica Pătrăucean
Lucas Smaira
Ankush Gupta
Adrià Recasens Continente
L. Markeeva
...
Y. Aytar
Simon Osindero
Dima Damen
Andrew Zisserman
João Carreira
VLM
232
179
0
23 May 2023