ResearchTrend.AI

arXiv: 2308.09245
Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos

18 August 2023
Zhiqiang Shen
Xiaoxiao Sheng
Hehe Fan
Longguang Wang
Y. Guo
Qiong Liu
Hao Wen
Xiaoping Zhou
    3DPC

Papers citing "Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos"

38 / 38 papers shown
Title
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos
Zhi Zuo
Chenyi Zhuang
Zhiqiang Shen
Pan Gao
Jie Qin
Nicu Sebe
3DPC
108
0
0
07 Apr 2025
Masked Image Modeling: A Survey
Vlad Hondru
Florinel-Alin Croitoru
Shervin Minaee
Radu Tudor Ionescu
N. Sebe
113
8
0
13 Aug 2024
Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning
Xiaoxiao Sheng
Zhiqiang Shen
Gang Xiao
3DPC
SSL
43
6
0
22 May 2023
PointCMP: Contrastive Mask Prediction for Self-supervised Learning on Point Cloud Videos
Zhiqiang Shen
Xiaoxiao Sheng
Longguang Wang
Y. Guo
Qiong Liu
Xiaoping Zhou
3DPC
SSL
62
14
0
06 May 2023
Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning
Zhuoyang Zhang
Yu Dong
Yunze Liu
Li Yi
3DPC
AI4TS
53
19
0
10 Dec 2022
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
Hao Wen
Yunze Liu
Jingwei Huang
Bokun Duan
Li Yi
ViT
3DPC
75
27
0
30 Jul 2022
Static and Dynamic Concepts for Self-supervised Video Representation Learning
Rui Qian
Shuangrui Ding
Xian Liu
Dahua Lin
SSL
43
25
0
26 Jul 2022
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences
Hehe Fan
Xin Yu
Yuhang Ding
Yi Yang
Mohan Kankanhalli
3DPC
153
112
0
27 May 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
217
1,189
0
23 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
56
168
0
21 Mar 2022
No Pain, Big Gain: Classify Dynamic Point Cloud Sequences with Static Models by Fitting Feature-level Space-time Surfaces
Jianqi Zhong
Kaichen Zhou
Qingyong Hu
Bing Wang
Niki Trigoni
Andrew Markham
3DPC
73
22
0
21 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wei Liu
Yonghong Tian
Li Yuan
3DPC
ViT
81
471
0
13 Mar 2022
Masked Feature Prediction for Self-Supervised Visual Pre-Training
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
141
668
0
16 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
81
208
0
02 Dec 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
118
675
0
29 Nov 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
176
1,352
0
18 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
451
7,739
0
11 Nov 2021
Spatial-Temporal Transformer for 3D Point Cloud Sequences
Yimin Wei
Hao Liu
Tingting Xie
Qiuhong Ke
Yulan Guo
3DPC
ViT
AI4TS
42
37
0
19 Oct 2021
Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
Siyuan Huang
Yichen Xie
Song-Chun Zhu
Yixin Zhu
3DPC
69
209
0
01 Sep 2021
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng Zhang
Stephen Lin
Han Hu
ViT
94
1,481
0
24 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
255
2,819
0
15 Jun 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
213
2,149
0
29 Mar 2021
Self-Supervised Pretraining of 3D Features on any Point-Cloud
Zaiwei Zhang
Rohit Girdhar
Armand Joulin
Ishan Misra
3DPC
161
272
0
07 Jan 2021
Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
Ji Hou
Benjamin Graham
Matthias Nießner
Saining Xie
3DPC
95
270
0
16 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
632
41,003
0
22 Oct 2020
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie
Jiatao Gu
Demi Guo
C. Qi
Leonidas Guibas
Or Litany
3DPC
200
636
0
21 Jul 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
743
41,932
0
28 May 2020
3DV: 3D Dynamic Voxel for Action Recognition in Depth Video
Yancheng Wang
Yang Xiao
Fu Xiong
Wenxiang Jiang
Zhiguo Cao
Qiufeng Wang
Junsong Yuan
3DPC
48
83
0
12 May 2020
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
Yongming Rao
Jiwen Lu
Jie Zhou
3DPC
52
131
0
29 Mar 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
358
18,739
0
13 Feb 2020
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
196
12,073
0
13 Nov 2019
MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences
Xingyu Liu
Mengyuan Yan
Jeannette Bohg
3DPC
55
193
0
21 Oct 2019
4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks
Chris Choy
JunYoung Gwak
Silvio Savarese
3DPC
149
1,787
0
18 Apr 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
FlowNet3D: Learning Scene Flow in 3D Point Clouds
Xingyu Liu
C. Qi
Leonidas Guibas
3DPC
85
480
0
04 Jun 2018
NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis
Amir Shahroudy
Jun Liu
T. Ng
G. Wang
240
2,489
0
11 Apr 2016
Rank Pooling for Action Recognition
Basura Fernando
E. Gavves
José Oramas
Amir Ghodrati
Tinne Tuytelaars
57
300
0
06 Dec 2015
Two-Stream Convolutional Networks for Action Recognition in Videos
Karen Simonyan
Andrew Zisserman
242
7,535
0
09 Jun 2014