2212.06795
Cited By
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
13 December 2022
Chenhongyi Yang
Jiarui Xu
Shalini De Mello
Elliot J. Crowley
Xihuai Wang
ViT
Papers citing "GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation"
19 / 19 papers shown
MaxGlaViT: A novel lightweight vision transformer-based approach for early diagnosis of glaucoma stages from fundus images
Mustafa Yurdakul
Kubra Uyar
Şakir Taşdemir
58
1
0
24 Feb 2025
From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation
Yunfei Xie
Cihang Xie
Alan Yuille
Jieru Mei
OCL
46
0
0
02 Sep 2024
ESOD: Efficient Small Object Detection on High-Resolution Images
Kai-Chun Liu
Zhihang Fu
Sheng Jin
Ze Chen
Fan Zhou
Rongxin Jiang
Yao-Shen Chen
Jieping Ye
ObjD
41
2
0
23 Jul 2024
Paving the way toward foundation models for irregular and unaligned Satellite Image Time Series
Iris Dumeur
Silvia Valero
Jordi Inglada
37
3
0
11 Jul 2024
Training a high-performance retinal foundation model with half-the-data and 400 times less compute
Justin Engelmann
Miguel O. Bernabeu
MedIm
OOD
29
0
0
30 Apr 2024
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang
Zehui Chen
Miguel Espinosa
Linus Ericsson
Zhenyu Wang
Jiaming Liu
Elliot J. Crowley
Mamba
36
86
0
26 Mar 2024
xT: Nested Tokenization for Larger Context in Large Images
Ritwik Gupta
Shufan Li
Tyler Lixuan Zhu
Jitendra Malik
Trevor Darrell
K. Mangalam
ViT
32
4
0
04 Mar 2024
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
32
74
0
14 Dec 2023
Revisiting the Encoding of Satellite Image Time Series
Xin Cai
Y. Bi
Peter Nicholl
Roy Sterritt
AI4TS
33
3
0
03 May 2023
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
Xihuai Wang
ViT
VLM
189
499
0
22 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
CMT: Convolutional Neural Networks Meet Vision Transformers
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Chunjing Xu
Yunhe Wang
Chang Xu
ViT
351
633
0
13 Jul 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
271
2,603
0
04 May 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
120
209
0
26 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
284
1,524
0
27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
277
3,623
0
24 Feb 2021
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi
Yin Cui
A. Srinivas
Rui Qian
Tsung-Yi Lin
E. D. Cubuk
Quoc V. Le
Barret Zoph
ISeg
249
968
0
13 Dec 2020
Deep High-Resolution Representation Learning for Visual Recognition
Jingdong Wang
Ke Sun
Tianheng Cheng
Borui Jiang
Chaorui Deng
...
Yadong Mu
Mingkui Tan
Xinggang Wang
Wenyu Liu
Bin Xiao
195
3,531
0
20 Aug 2019
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
297
10,220
0
16 Nov 2016