2407.13920
DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention
18 July 2024
Xiaoya Tang
Bodong Zhang
Beatrice Knudsen
Tolga Tasdizen
ViT
MedIm
Papers citing
"DuoFormer: Leveraging Hierarchical Visual Representations by Local and Global Attention"
12 / 12 papers shown
Title
CLASS-M: Adaptive stain separation-based contrastive learning with pseudo-labeling for histopathological image classification
Bodong Zhang
Hamid Manoochehri
M. M. Ho
Fahimeh Fooladgar
Yosep Chong
Beatrice Knudsen
Deepika Sirohi
Tolga Tasdizen
69
4
0
12 Dec 2023
Scale-Aware Modulation Meet Transformer
Weifeng Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
98
68
0
17 Jul 2023
Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition
Qibin Hou
Cheng Lu
Ming-Ming Cheng
Jiashi Feng
ViT
67
131
0
22 Nov 2022
MetaFormer Is Actually What You Need for Vision
Weihao Yu
Mi Luo
Pan Zhou
Chenyang Si
Yichen Zhou
Xinchao Wang
Jiashi Feng
Shuicheng Yan
130
896
0
22 Nov 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
118
1,248
0
22 Apr 2021
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo
Sangdoo Yun
Dongyoon Han
Sanghyuk Chun
Junsuk Choe
Seong Joon Oh
ViT
475
573
0
30 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Rameswar Panda
ViT
55
1,450
0
27 Mar 2021
Incorporating Convolution Designs into Visual Transformers
Kun Yuan
Shaopeng Guo
Ziwei Liu
Aojun Zhou
F. Yu
Wei Wu
ViT
86
472
0
22 Mar 2021
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases
Stéphane d'Ascoli
Hugo Touvron
Matthew L. Leavitt
Ari S. Morcos
Giulio Biroli
Levent Sagun
ViT
92
818
0
19 Mar 2021
UNETR: Transformers for 3D Medical Image Segmentation
Ali Hatamizadeh
Yucheng Tang
Vishwesh Nath
Dong Yang
Andriy Myronenko
Bennett Landman
H. Roth
Daguang Xu
ViT
MedIm
145
1,567
0
18 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
450
3,678
0
24 Feb 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
303
6,657
0
23 Dec 2020