Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2102.00719
Cited By
Video Transformer Network
1 February 2021
Daniel Neimark
Omri Bar
Maya Zohar
Dotan Asselmann
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Video Transformer Network"
50 / 100 papers shown
Title
F
3
^3
3
Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos
Zhaoyu Liu
Kan Jiang
Murong Ma
Zhé Hóu
Yun Lin
J. Dong
37
0
0
11 Apr 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
45
0
0
11 Feb 2025
Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
41
16
0
03 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Yunshan Zhong
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Rongrong Ji
Zhanpeng Zeng
Rongrong Ji
MQ
94
0
0
31 Dec 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
98
0
0
20 Nov 2024
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
Xiufeng Song
Xiao Guo
J. Zhang
Qirui Li
Lei Bai
Xiaoming Liu
Guangtao Zhai
Xiaohong Liu
DiffM
VGen
71
9
0
31 Oct 2024
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
35
0
05 Apr 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
69
35
0
16 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Ziqiang Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
125
239
0
05 Jan 2024
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Tianlin Li
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
81
3
0
18 Dec 2023
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun V. Reddy
William Paul
Corban Rivera
Ketul Shah
Celso M. de Melo
Rama Chellappa
37
4
0
05 Dec 2023
Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation
A. Dasgupta
C. V. Jawahar
Karteek Alahari
TTA
VLM
18
10
0
30 Nov 2023
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition
Jiaming Zhou
Hanjun Li
Kun-Yu Lin
Junwei Liang
26
1
0
28 Nov 2023
Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
Ming Jin
Qingsong Wen
Yuxuan Liang
Chaoli Zhang
Siqiao Xue
...
Shirui Pan
Vincent S. Tseng
Yu Zheng
Lei Chen
Hui Xiong
AI4TS
SyDa
35
117
0
16 Oct 2023
RT-GAN: Recurrent Temporal GAN for Adding Lightweight Temporal Consistency to Frame-Based Domain Translation Approaches
Shawn Mathew
Saad Nadeem
Alvin C. Goh
Arie Kaufman
MedIm
49
0
0
02 Oct 2023
Masked Feature Modelling: Feature Masking for the Unsupervised Pre-training of a Graph Attention Network Block for Bottom-up Video Event Recognition
Dimitrios Daskalakis
Nikolaos Gkalelis
Vasileios Mezaris
36
0
0
24 Aug 2023
Towards Privacy-Supporting Fall Detection via Deep Unsupervised RGB2Depth Adaptation
Hejun Xiao
Kunyu Peng
Xiangsheng Huang
Alina Roitberg
Hao Li
Zhao Wang
Rainer Stiefelhagen
18
3
0
23 Aug 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
40
8
0
18 Jul 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
F. Khan
M. Shah
VLM
VPVLM
33
74
0
06 Apr 2023
On the Benefits of 3D Pose and Tracking for Human Action Recognition
Jathushan Rajasegaran
Georgios Pavlakos
Angjoo Kanazawa
Christoph Feichtenhofer
Jitendra Malik
36
30
0
03 Apr 2023
SVT: Supertoken Video Transformer for Efficient Video Understanding
Chen-Ming Pan
Rui Hou
Hanchao Yu
Qifan Wang
Senem Velipasalar
Madian Khabsa
ViT
21
0
0
01 Apr 2023
Multi-view knowledge distillation transformer for human action recognition
Yi Lin
Vincent S. Tseng
ViT
26
1
0
25 Mar 2023
Learning Spatial-Temporal Implicit Neural Representations for Event-Guided Video Super-Resolution
Yunfan Lu
Zipeng Wang
Minjie Liu
Hongjian Wang
Lin Wang
SupR
23
31
0
24 Mar 2023
Graph Transformer GANs for Graph-Constrained House Generation
H. Tang
Zhenyu Zhang
Humphrey Shi
Bo-wen Li
Lin Shao
N. Sebe
Radu Timofte
Luc Van Gool
46
19
0
14 Mar 2023
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Junyan Wang
Zhenhong Sun
Yichen Qian
Dong Gong
Xiuyu Sun
Ming Lin
M. Pagnucco
Yang Song
3DPC
20
11
0
05 Mar 2023
GETNext: Trajectory Flow Map Enhanced Transformer for Next POI Recommendation
Song Yang
Jiamou Liu
Kaiqi Zhao
AI4TS
15
135
0
03 Mar 2023
ViTs for SITS: Vision Transformers for Satellite Image Time Series
Michail Tarasiou
Erik Chavez
S. Zafeiriou
ViT
11
48
0
12 Jan 2023
Fruit Ripeness Classification: a Survey
Matteo Rizzo
Matteo Marcuzzo
A. Zangari
A. Gasparetto
A. Albarelli
27
62
0
29 Dec 2022
A Survey on Human Action Recognition
Zhou Shuchang
29
0
0
20 Dec 2022
Cross-Modal Learning with 3D Deformable Attention for Action Recognition
Sangwon Kim
Dasom Ahn
ByoungChul Ko
ViT
3DPC
35
24
0
12 Dec 2022
Video Test-Time Adaptation for Action Recognition
Wei Lin
M. Jehanzeb Mirza
Mateusz Koziñski
Horst Possegger
Hilde Kuehne
Horst Bischof
TTA
47
31
0
24 Nov 2022
SVFormer: Semi-supervised Video Transformer for Action Recognition
Zhen Xing
Qi Dai
Hang-Rui Hu
Jingjing Chen
Zuxuan Wu
Yu-Gang Jiang
ViT
30
69
0
23 Nov 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training
Guoxi Huang
A. Bors
24
1
0
23 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
Limin Wang
Yu Qiao
ViT
30
107
0
17 Nov 2022
PatchBlender: A Motion Prior for Video Transformers
Gabriele Prato
Yale Song
Janarthanan Rajendran
R. Devon Hjelm
Neel Joshi
Sarath Chandar
ViT
27
0
0
11 Nov 2022
Focal and Global Spatial-Temporal Transformer for Skeleton-based Action Recognition
Zhimin Gao
Peitao Wang
Pei Lv
Xiaoheng Jiang
Qi-dong Liu
Pichao Wang
Mingliang Xu
Wanqing Li
ViT
52
27
0
06 Oct 2022
Leveraging Self-Supervised Training for Unintentional Action Recognition
Enea Duka
Anna Kukleva
Bernt Schiele
27
1
0
23 Sep 2022
PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers
Zhikai Li
Mengjuan Chen
Junrui Xiao
Qingyi Gu
ViT
MQ
43
33
0
13 Sep 2022
Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomography
Thomas Z. Li
Kaiwen Xu
Riqiang Gao
Yucheng Tang
Thomas A. Lasko
Fabien Maldonado
K. Sandler
Bennett A. Landman
ViT
MedIm
22
11
0
04 Sep 2022
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition
Duc-Quang Vu
T. Phung
Jia-Ching Wang
24
9
0
03 Sep 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
42
6
0
30 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
28
313
0
04 Aug 2022
MAR: Masked Autoencoders for Efficient Action Recognition
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Xiang Wang
Yuehuang Wang
Yiliang Lv
Changxin Gao
Nong Sang
29
42
0
24 Jul 2022
Time Is MattEr: Temporal Self-supervision for Video Transformers
Sukmin Yun
Jaehyung Kim
Dongyoon Han
Hwanjun Song
Jung-Woo Ha
Jinwoo Shin
ViT
15
12
0
19 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
21
35
0
13 Jul 2022
Earthformer: Exploring Space-Time Transformers for Earth System Forecasting
Zhihan Gao
Xingjian Shi
Hao Wang
Yi Zhu
Yuyang Wang
Mu Li
Dit-Yan Yeung
AI4TS
39
149
0
12 Jul 2022
VidConv: A modernized 2D ConvNet for Efficient Video Recognition
Chuong H. Nguyen
Su Huynh
Vinh Nguyen
Ngoc-Khanh Nguyen
ViT
27
3
0
08 Jul 2022
Large-scale Robustness Analysis of Video Action Recognition Models
Madeline Chantry Schiappa
Naman Biyani
Prudvi Kamtam
Shruti Vyas
Hamid Palangi
Vibhav Vineet
Y. S. Rawat
AAML
34
24
0
04 Jul 2022
CTrGAN: Cycle Transformers GAN for Gait Transfer
Shahar Mahpod
Noam Gaash
Hay Hoffman
Gil Ben-Artzi
ViT
28
1
0
30 Jun 2022
1
2
Next