arXiv:2307.05721
HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding
9 July 2023
Hao Zheng
R. Lee
Yuqian Lu
VGen
Papers citing "HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding"
19 / 19 papers shown
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
96
113
0
17 Nov 2022
Video-based Human-Object Interaction Detection from Tubelet Tokens
Danyang Tu
Wei Sun
Xiongkuo Min
Guangtao Zhai
Wei Shen
ViT
83
17
0
04 Jun 2022
Future Transformer for Long-term Action Anticipation
Dayoung Gong
Joonseok Lee
Manjin Kim
S. Ha
Minsu Cho
AI4TS
46
66
0
27 May 2022
PYSKL: Towards Good Practices for Skeleton Action Recognition
Haodong Duan
Jiaqi Wang
Kai-xiang Chen
Dahua Lin
VLM
57
143
0
19 May 2022
Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities
Fadime Sener
Dibyadip Chatterjee
Daniel Shelepov
Kun He
Dipika Singhania
Robert Y. Wang
Angela Yao
VGen
75
214
0
28 Mar 2022
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Hao Zhang
Feng Li
Shilong Liu
Lei Zhang
Hang Su
Jun Zhu
L. Ni
H. Shum
ViT
164
1,435
0
07 Mar 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chaoxia Wu
Alan Yuille
Christoph Feichtenhofer
ViT
145
668
0
16 Dec 2021
MViTv2: Improved Multiscale Vision Transformers for Classification and Detection
Yanghao Li
Chaoxia Wu
Haoqi Fan
K. Mangalam
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
146
689
0
02 Dec 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
367
2,053
0
09 Feb 2021
Temporal Relational Modeling with Self-Supervision for Action Segmentation
Dong Wang
Di Hu
Xingjian Li
Dejing Dou
57
53
0
14 Dec 2020
The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain
Francesco Ragusa
Antonino Furnari
S. Livatino
G. Farinella
EgoV
40
100
0
12 Oct 2020
The IKEA ASM Dataset: Understanding People Assembling Furniture through Actions, Objects and Pose
Yizhak Ben-Shabat
Xin Yu
F. Saleh
Dylan Campbell
Cristian Rodriguez-Opazo
Hongdong Li
Stephen Gould
69
114
0
01 Jul 2020
MMDetection: Open MMLab Detection Toolbox and Benchmark
Kai-xiang Chen
Jiaqi Wang
Jiangmiao Pang
Yuhang Cao
Yu Xiong
...
Jingdong Wang
Jianping Shi
Wanli Ouyang
Chen Change Loy
Dahua Lin
VOS
151
2,868
0
17 Jun 2019
TSM: Temporal Shift Module for Efficient Video Understanding
Ji Lin
Chuang Gan
Song Han
98
1,691
0
20 Nov 2018
When will you do what? - Anticipating Temporal Occurrences of Activities
Yazan Abu Farha
Alexander Richard
Juergen Gall
65
191
0
03 Apr 2018
Datasheets for Datasets
Timnit Gebru
Jamie Morgenstern
Briana Vecchione
Jennifer Wortman Vaughan
Hanna M. Wallach
Hal Daumé
Kate Crawford
261
2,184
0
23 Mar 2018
Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition
Sijie Yan
Yuanjun Xiong
Dahua Lin
GNN
241
4,169
0
23 Jan 2018
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
João Carreira
Andrew Zisserman
232
8,019
0
22 May 2017
Simple Online and Realtime Tracking
Alex Bewley
Zongyuan Ge
Lionel Ott
F. Ramos
B. Upcroft
VOT
84
3,095
0
02 Feb 2016