2211.13222
Cited By
SVFormer: Semi-supervised Video Transformer for Action Recognition
23 November 2022
Zhen Xing
Qi Dai
Hang-Rui Hu
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
ViT
Papers citing
"SVFormer: Semi-supervised Video Transformer for Action Recognition"
50 / 50 papers shown
Title
RadarLLM: Empowering Large Language Models to Understand Human Motion from Millimeter-wave Point Cloud Sequence
Zengyuan Lai
Jiarui Yang
Songpengcheng Xia
Lizhou Lin
Lan Sun
Renwen Wang
J. Liu
Qi Wu
Ling Pei
38
0
0
14 Apr 2025
Hierarchical Relation-augmented Representation Generalization for Few-shot Action Recognition
Hongyu Qu
Ling Xing
Rui Yan
Yazhou Yao
G. Xie
Xiangbo Shu
29
0
0
14 Apr 2025
FLAMES: A Hybrid Spiking-State Space Model for Adaptive Memory Retention in Event-Based Learning
Biswadeep Chakraborty
Saibal Mukhopadhyay
47
0
0
02 Apr 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
199
0
0
26 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
79
2
0
19 Mar 2025
VideoScan: Enabling Efficient Streaming Video Understanding via Frame-level Semantic Carriers
Ruanjun Li
Yuedong Tan
Yuanming Shi
Jiawei Shao
VLM
125
0
0
12 Mar 2025
Semi-Supervised Audio-Visual Video Action Recognition with Audio Source Localization Guided Mixup
Seokun Kang
Taehwan Kim
42
0
0
04 Mar 2025
Dual Invariance Self-training for Reliable Semi-supervised Surgical Phase Recognition
Sahar Nasirihaghighi
Negin Ghamsarian
Raphael Sznitman
Klaus Schoeffmann
37
1
0
29 Jan 2025
Bridging the Gaps: Utilizing Unlabeled Face Recognition Datasets to Boost Semi-Supervised Facial Expression Recognition
Jie Song
Mengqiao He
Jinhua Feng
B. S.
22
0
0
23 Oct 2024
Cognition Transferring and Decoupling for Text-supervised Egocentric Semantic Segmentation
Zhaofeng Shi
Heqian Qiu
Lanxiao Wang
Fanman Meng
Q. Wu
Hongliang Li
30
2
0
02 Oct 2024
FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition
Ishan Rajendrakumar Dave
Mamshad Nayeem Rizve
Mubarak Shah
AI4TS
23
2
0
02 Sep 2024
Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms
Xiao Wang
Shiao Wang
Pengpeng Shao
Bo Jiang
Lin Zhu
Yonghong Tian
107
2
0
19 Aug 2024
MPT-PAR: Mix-Parameters Transformer for Panoramic Activity Recognition
Wenqing Gan
Yaoyu Li
Jian Li
Zhangang Lin
ViT
30
0
0
01 Aug 2024
Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization
Feixiang Zhou
Bryan M. Williams
Hossein Rahmani
40
1
0
10 Jul 2024
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition
Y. Hao
Diansong Zhou
Zhicai Wang
Chong-Wah Ngo
Meng Wang
ViT
32
4
0
03 Jul 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
43
14
0
10 Jun 2024
Video-Language Understanding: A Survey from Model Architecture, Model Training, and Data Perspectives
Thong Nguyen
Yi Bin
Junbin Xiao
Leigang Qu
Yicong Li
Jay Zhangjie Wu
Cong-Duy Nguyen
See-Kiong Ng
Luu Anh Tuan
VLM
43
9
1
09 Jun 2024
RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model
Khaled Alomar
Halil Ibrahim Aysel
Xiaohao Cai
MedIm
ViT
37
7
0
02 Jun 2024
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
Shuangpeng Han
Ziyu Wang
Mengmi Zhang
26
0
0
26 May 2024
Wearable-based behaviour interpolation for semi-supervised human activity recognition
Haoran Duan
Shidong Wang
Varun Ojha
Shizheng Wang
Yawen Huang
Yang Long
R. Ranjan
Yefeng Zheng
HAI
15
4
0
24 May 2024
PitcherNet: Powering the Moneyball Evolution in Baseball Video Analytics
Jerrin Bright
Bavesh Balaji
Yuhao Chen
David A Clausi
John S. Zelek
24
0
0
13 May 2024
Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition
Yu Wang
Sanpin Zhou
Kun Xia
Le Wang
31
0
0
25 Apr 2024
ViTAR: Vision Transformer with Any Resolution
Qihang Fan
Quanzeng You
Xiaotian Han
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
ViT
37
14
0
27 Mar 2024
vid-TLDR: Training Free Token merging for Light-weight Video Transformer
Joonmyung Choi
Sanghyeok Lee
Jaewon Chu
Minhyuk Choi
Hyunwoo J. Kim
MoMe
ViT
44
12
0
20 Mar 2024
Learning Causal Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition
Yuke Li
Guangyi Chen
Ben Abramowitz
Stefano Anzellotti
Donglai Wei
TTA
40
1
0
20 Feb 2024
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
25
9
0
07 Feb 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
F. Worgotter
Alexander S. Ecker
28
3
0
29 Jan 2024
SignVTCL: Multi-Modal Continuous Sign Language Recognition Enhanced by Visual-Textual Contrastive Learning
Hao Chen
Jiaze Wang
Ziyu Guo
Jinpeng Li
Donghao Zhou
Bian Wu
Chenyong Guan
Guangyong Chen
Pheng-Ann Heng
25
5
0
22 Jan 2024
Roll With the Punches: Expansion and Shrinkage of Soft Label Selection for Semi-supervised Fine-Grained Learning
Yue Duan
Zhen Zhao
Lei Qi
Luping Zhou
Lei Wang
Yinghuan Shi
30
4
0
19 Dec 2023
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Zhen Xing
Qi Dai
Zihao Zhang
Hui Zhang
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
50
17
0
30 Nov 2023
MotionEditor: Editing Video Motion via Content-Aware Diffusion
Shuyuan Tu
Qi Dai
Zhi-Qi Cheng
Hang-Rui Hu
Xintong Han
Zuxuan Wu
Yu-Gang Jiang
DiffM
VGen
28
30
0
30 Nov 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
57
116
0
16 Oct 2023
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data
Zuxuan Wu
Zejia Weng
Wujian Peng
Xitong Yang
Ang Li
Larry S. Davis
Yu-Gang Jiang
CLIP
VLM
33
21
0
08 Oct 2023
XVO: Generalized Visual Odometry via Cross-Modal Self-Training
Tohida Rehman
Ronit Mandal
Jimuyang Zhang
Debarshi Kumar Sanyal
SSL
33
17
0
28 Sep 2023
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding
Zhixin Ling
Zhen Xing
Xiangdong Zhou
Manliang Cao
G. Zhou
ViT
26
17
0
28 Aug 2023
SimDA: Simple Diffusion Adapter for Efficient Video Generation
Zhen Xing
Qi Dai
Hang-Rui Hu
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
29
81
0
18 Aug 2023
LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning
Jifan Zhang
Yifang Chen
Gregory H. Canal
Stephen Mussmann
Arnav M. Das
...
Yinglun Zhu
Jeffrey Bilmes
S. Du
Kevin G. Jamieson
Robert D. Nowak
VLM
33
10
0
16 Jun 2023
Cross-view Action Recognition Understanding From Exocentric to Egocentric Perspective
Thanh-Dat Truong
Khoa Luu
EgoV
27
10
0
25 May 2023
Transfer Learning for Fine-grained Classification Using Semi-supervised Learning and Visual Transformers
Manuel Lagunas
Brayan Impata
Victor Martinez
Virginia Fernandez
Christos Georgakis
Sofia Braun
Felipe Bertrand
ViT
25
8
0
17 May 2023
Implicit Temporal Modeling with Learnable Alignment for Video Recognition
S. Tu
Qi Dai
Zuxuan Wu
Zhi-Qi Cheng
Hang-Rui Hu
Yu-Gang Jiang
30
35
0
20 Apr 2023
CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection
Kevin Hyekang Joo
Khoa T. Vo
Kashu Yamazaki
Ngan Le
19
38
0
09 Dec 2022
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Lu Yuan
Yu-Gang Jiang
VGen
29
87
0
08 Dec 2022
Prototypical Residual Networks for Anomaly Detection and Localization
H. Zhang
Zuxuan Wu
Z. Wang
Zhineng Chen
Yuwei Jiang
UQCV
AI4TS
35
62
0
05 Dec 2022
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian
Zuxuan Wu
Qiuju Dai
Hang-Rui Hu
Yu Qiao
Yu-Gang Jiang
ViT
19
31
0
01 Dec 2022
On the Surprising Effectiveness of Transformers in Low-Labeled Video Recognition
Farrukh Rahman
Ömer Mubarek
Z. Kira
ViT
10
2
0
15 Sep 2022
ViT-ReT: Vision and Recurrent Transformer Neural Networks for Human Activity Recognition in Videos
James Wensel
Hayat Ullah
Arslan Munir
ViT
16
42
0
16 Aug 2022
Human Activity Recognition Using Cascaded Dual Attention CNN and Bi-Directional GRU Framework
Hayat Ullah
Arslan Munir
HAI
19
27
0
09 Aug 2022
Semi-Supervised Semantic Segmentation with Pixel-Level Contrastive Learning from a Class-wise Memory Bank
Inigo Alonso
Alberto Sabater
David Ferstl
Luis Montesano
Ana C. Murillo
SSL
CLL
121
203
0
27 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,981
0
09 Feb 2021
Video Transformer Network
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
198
421
0
01 Feb 2021