2403.18361
Cited By
ViTAR: Vision Transformer with Any Resolution
27 March 2024
Qihang Fan
Quanzeng You
Xiaotian Han
Yongfei Liu
Yunzhe Tao
Huaibo Huang
Ran He
Hongxia Yang
ViT
Papers citing
"ViTAR: Vision Transformer with Any Resolution"
12 / 12 papers shown
UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao
Yiyang Gan
Bairui Wang
Jie Qin
Shuang Xu
Siqi Yang
Lin Ma
57
0
0
02 Apr 2025
FlexiMo: A Flexible Remote Sensing Foundation Model
Xuyang Li
Chenyu Li
Pedram Ghamisi
Danfeng Hong
40
0
0
31 Mar 2025
PathoHR: Breast Cancer Survival Prediction on High-Resolution Pathological Images
Y. Luo
Shiru Wang
Jiaheng Liu
Jiaxuan Xiao
Rundong Xue
Zeyu Zhang
Hao Zhang
Yu Lu
Yang Zhao
Yutong Xie
42
0
0
23 Mar 2025
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
Zihao Zhang
Haoran Chen
Haoyu Zhao
Guansong Lu
Yanwei Fu
Hang Xu
Zuxuan Wu
VGen
DiffM
71
0
0
20 Mar 2025
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Zuyan Liu
Yuhao Dong
Ziwei Liu
Winston Hu
Jiwen Lu
Yongming Rao
ObjD
86
54
0
19 Sep 2024
SDformerFlow: Spatiotemporal swin spikeformer for event-based optical flow estimation
Yi Tian
Juan Andrade-Cetto
32
0
0
06 Sep 2024
Cross-Domain Foundation Model Adaptation: Pioneering Computer Vision Models for Geophysical Data Analysis
Zhixiang Guo
Xinming Wu
Luming Liang
Hanlin Sheng
Nuo Chen
Zhengfa Bi
AI4CE
51
1
0
22 Aug 2024
Liveness Detection in Computer Vision: Transformer-based Self-Supervised Learning for Face Anti-Spoofing
Arman Keresh
Pakizar Shamoi
46
5
0
19 Jun 2024
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
351
633
0
13 Jul 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
283
3,623
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
304
3,708
0
11 Feb 2021