2308.09245
Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos
18 August 2023
Zhiqiang Shen
Xiaoxiao Sheng
Hehe Fan
Longguang Wang
Y. Guo
Qiong Liu
Hao Wen
Xiaoping Zhou
3DPC
Papers citing
"Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos"
38 / 38 papers shown
Title
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos
Zhi Zuo
Chenyi Zhuang
Zhiqiang Shen
Pan Gao
Jie Qin
Nicu Sebe
3DPC
108
0
0
07 Apr 2025
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
113
8
0
13 Aug 2024
Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning
Xiaoxiao Sheng
Zhiqiang Shen
Gang Xiao
3DPC
SSL
43
6
0
22 May 2023
PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos
Zhiqiang Shen
Xiaoxiao Sheng
Longguang Wang
Y. Guo
Qiong Liu
Xiaoping Zhou
3DPC
SSL
62
14
0
06 May 2023
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning
Zhuoyang Zhang
Yu Dong
Yunze Liu
Li Yi
3DPC
AI4TS
53
19
0
10 Dec 2022
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
Hao Wen
Yunze Liu
Jingwei Huang
Bokun Duan
Li Yi
ViT
3DPC
75
27
0
30 Jul 2022
Static and Dynamic Concepts for Self-supervised Video Representation Learning
Rui Qian
Shuangrui Ding
Xian Liu
Dahua Lin
SSL
43
25
0
26 Jul 2022
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences
Hehe Fan
Xin Yu
Yuhang Ding
Yi Yang
Mohan Kankanhalli
3DPC
153
112
0
27 May 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
217
1,189
0
23 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
56
168
0
21 Mar 2022
No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces
Jianqi Zhong
Kaichen Zhou
Qingyong Hu
Bing Wang
Niki Trigoni
Andrew Markham
3DPC
73
22
0
21 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wei Liu
Yonghong Tian
Li Yuan
3DPC
ViT
81
471
0
13 Mar 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
141
668
0
16 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
81
208
0
02 Dec 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
118
675
0
29 Nov 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
176
1,352
0
18 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
451
7,739
0
11 Nov 2021
Spatial-Temporal Transformer for 3D Point Cloud Sequences
Yimin Wei
Hao Liu
Tingting Xie
Qiuhong Ke
Yulan Guo
3DPC
ViT
AI4TS
42
37
0
19 Oct 2021
Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
Siyuan Huang
Yichen Xie
Song-Chun Zhu
Yixin Zhu
3DPC
69
209
0
01 Sep 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
94
1,481
0
24 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
255
2,819
0
15 Jun 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
213
2,149
0
29 Mar 2021
Self-Supervised Pretraining of 3D Features on any Point-Cloud
Zaiwei Zhang
Rohit Girdhar
Armand Joulin
Ishan Misra
3DPC
161
272
0
07 Jan 2021
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
Ji Hou
Benjamin Graham
Matthias Nießner
Saining Xie
3DPC
95
270
0
16 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
632
41,003
0
22 Oct 2020
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie
Jiatao Gu
Demi Guo
C. Qi
Leonidas Guibas
Or Litany
3DPC
200
636
0
21 Jul 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
743
41,932
0
28 May 2020
3DV: 3D Dynamic Voxel for Action Recognition in Depth Video
Yancheng Wang
Yang Xiao
Fu Xiong
Wenxiang Jiang
Zhiguo Cao
Qiufeng Wang
Junsong Yuan
3DPC
48
83
0
12 May 2020
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
Yongming Rao
Jiwen Lu
Jie Zhou
3DPC
52
131
0
29 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
358
18,739
0
13 Feb 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
196
12,073
0
13 Nov 2019
MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences
Xingyu Liu
Mengyuan Yan
Jeannette Bohg
3DPC
55
193
0
21 Oct 2019
4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
Chris Choy
JunYoung Gwak
Silvio Savarese
3DPC
149
1,787
0
18 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
FlowNet3D: Learning Scene Flow in 3D Point Clouds
Xingyu Liu
C. Qi
Leonidas Guibas
3DPC
85
480
0
04 Jun 2018
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
Amir Shahroudy
Jun Liu
T. Ng
G. Wang
240
2,489
0
11 Apr 2016
Rank Pooling for Action Recognition
Basura Fernando
E. Gavves
José Oramas
Amir Ghodrati
Tinne Tuytelaars
57
300
0
06 Dec 2015
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
242
7,535
0
09 Jun 2014