2106.10270
Cited By
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
18 June 2021
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
Papers citing
"How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers"
50 / 415 papers shown
Title
Vote&Mix: Plug-and-Play Token Reduction for Efficient Vision Transformer
Shuai Peng
Di Fu
Baole Wei
Yong Cao
Liangcai Gao
Zhi Tang
ViT
45
1
0
30 Aug 2024
Symmetric masking strategy enhances the performance of Masked Image Modeling
Khanh-Binh Nguyen
Chae Jung Park
34
0
0
23 Aug 2024
Supervised Representation Learning towards Generalizable Assembly State Recognition
Tim J. Schoonbeek
Goutham Balachandran
H. Onvlee
Tim Houben
Shao-Hsuan Hung
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
42
1
0
21 Aug 2024
Focus on Focus: Focus-oriented Representation Learning and Multi-view Cross-modal Alignment for Glioma Grading
Li Pan
Yupei Zhang
Qiushi Yang
Tan Li
Xiaohan Xing
Maximus C. F. Yeung
Zhen Chen
45
1
0
16 Aug 2024
Beyond Uniform Query Distribution: Key-Driven Grouped Query Attention
Zohaib Khan
Muhammad Khaquan
Omer Tafveez
Burhanuddin Samiwala
Agha Ali Raza
38
3
0
15 Aug 2024
Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
Weijie Zheng
Xingjun Ma
Hanxun Huang
Zuxuan Wu
Yu-Gang Jiang
AAML
37
0
0
03 Aug 2024
Privacy-Preserving Split Learning with Vision Transformers using Patch-Wise Random and Noisy CutMix
Yang Jin
Sihun Baek
Lei Zhang
Hyelin Nam
Praneeth Vepakomma
Ramesh Raskar
Mehdi Bennis
Seong-Lyun Kim
36
2
0
02 Aug 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
26
21
0
31 Jul 2024
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Gagan Jain
Nidhi Hegde
Aditya Kusupati
Arsha Nagrani
Shyamal Buch
Prateek Jain
Anurag Arnab
Sujoy Paul
MoE
48
7
0
29 Jul 2024
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
Tianxiao Zhang
Wenju Xu
Bo Luo
Guanghui Wang
ViT
MDE
40
7
0
28 Jul 2024
A Survey on Cell Nuclei Instance Segmentation and Classification: Leveraging Context and Attention
João D. Nunes
D. Montezuma
Domingos Oliveira
Tania Pereira
Jaime S. Cardoso
49
1
0
26 Jul 2024
Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring
Mario Francisco Munoz
Hoang Vu Huy
Thanh-Dung Le
42
1
0
18 Jul 2024
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Large Models
Chen Ju
Haicheng Wang
Haozhe Cheng
Xu Chen
Zhonghua Zhai
Weilin Huang
Jinsong Lan
Shuai Xiao
Bo Zheng
VLM
49
5
0
16 Jul 2024
Adaptive Parametric Activation
Konstantinos Panagiotis Alexandridis
Jiankang Deng
Anh Nguyen
Shan Luo
41
2
0
11 Jul 2024
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Kazi Sajeed Mehrab
M. Maruf
Arka Daw
Harish Babu Manogaran
Abhilash Neog
...
Paula Mabee
Wasila Dahdul
Anuj Karpatne
41
4
0
10 Jul 2024
CTRL-F: Pairing Convolution with Transformer for Image Classification via Multi-Level Feature Cross-Attention and Representation Learning Fusion
Hosam S. El-Assiouti
Hadeer El-Saadawy
M. Al-Berry
M. Tolba
ViT
52
0
0
09 Jul 2024
Image-Conditional Diffusion Transformer for Underwater Image Enhancement
Xingyang Nie
Su Pan
Xiaoyu Zhai
Shifei Tao
Fengzhong Qu
Biao Wang
Huilin Ge
Guojie Xiao
34
2
0
07 Jul 2024
Precision at Scale: Domain-Specific Datasets On-Demand
Jesús M. Rodríguez-de-Vera
Imanol G. Estepa
Ignacio Sarasúa
Bhalaji Nagarajan
P. Radeva
36
2
0
03 Jul 2024
PathAlign: A vision-language model for whole slide images in histopathology
Faruk Ahmed
Andrew Sellergren
Lin Yang
Shawn Xu
Boris Babenko
...
S. Shetty
Daniel Golden
Yun-hui Liu
David F. Steiner
Ellery Wulczyn
LM&MA
VLM
36
15
0
27 Jun 2024
Towards Efficient and Scalable Training of Differentially Private Deep Learning
Sebastian Rodriguez Beltran
Marlon Tobaben
Niki Loppi
Antti Honkela
34
0
0
25 Jun 2024
A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
Thomas Stegmüller
Tim Lebailly
Nikola Dukic
Behzad Bozorgtabar
Tinne Tuytelaars
Jean-Philippe Thiran
VLM
39
1
0
23 Jun 2024
Potion: Towards Poison Unlearning
Stefan Schoepf
Jack Foster
Alexandra Brintrup
AAML
MU
49
7
0
13 Jun 2024
UDON: Universal Dynamic Online distillatioN for generic image representations
Nikolaos-Antonios Ypsilantis
Kaifeng Chen
André Araujo
Ondřej Chum
43
3
0
12 Jun 2024
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
Wenxiao Wang
Weiming Zhuang
Lingjuan Lyu
44
0
0
11 Jun 2024
Adapters Strike Back
Jan-Martin O. Steitz
Stefan Roth
27
5
0
10 Jun 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang
ByungKun Lee
Hojoon Lee
Hyunseung Kim
Jaegul Choo
53
0
0
10 Jun 2024
Nomic Embed Vision: Expanding the Latent Space
Zach Nussbaum
Brandon Duderstadt
Andriy Mulyar
VLM
33
5
0
06 Jun 2024
Parameter-Inverted Image Pyramid Networks
Xizhou Zhu
Xue Yang
Zhaokai Wang
Hao Li
Wenhan Dou
Junqi Ge
Lewei Lu
Yu Qiao
Jifeng Dai
47
0
0
06 Jun 2024
M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and RGB Data
Matthew J Allen
Francisco Dorr
Joseph A. Gallego-Mejia
Laura Martínez-Ferrer
Anna Jungbluth
Freddie Kalaitzis
Raúl Ramos-Pollán
33
3
0
06 Jun 2024
Reassessing How to Compare and Improve the Calibration of Machine Learning Models
M. Chidambaram
Rong Ge
74
1
0
06 Jun 2024
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You
Zekai Xu
Chen Nie
Zhijie Deng
Qinghai Guo
Xiang Wang
Zhezhi He
40
10
0
05 Jun 2024
Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need
Martin Wistuba
Prabhu Teja Sivaprasad
Lukas Balles
Giovanni Zappella
32
0
0
05 Jun 2024
On the Nonlinearity of Layer Normalization
Yunhao Ni
Yuxin Guo
Junlong Jia
Lei Huang
39
4
0
03 Jun 2024
Searching for internal symbols underlying deep learning
J. H. Lee
Sujith Vijayan
AI4CE
35
0
0
31 May 2024
Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology
Frank Ruis
Alma M. Liezenga
Friso G. Heslinga
Luca Ballan
Thijs A. Eker
Richard J. M. den Hollander
Martin C. van Leeuwen
Judith Dijk
Wyke Huizinga
38
4
0
30 May 2024
Wavelet-Based Image Tokenizer for Vision Transformers
Zhenhai Zhu
Radu Soricut
ViT
50
3
0
28 May 2024
MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
Wenzhuo Liu
Fei Zhu
Shijie Ma
Cheng-Lin Liu
30
4
0
28 May 2024
Activator: GLU Activation Function as the Core Component of a Vision Transformer
Abdullah Nazhat Abdullah
Tarkan Aydin
ViT
43
0
0
24 May 2024
Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers
Bum Jun Kim
Sang Woo Kim
ViT
43
1
0
23 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
52
2
0
22 May 2024
Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
Tarun Kalluri
Jihyeon Janel Lee
Kihyuk Sohn
Sahil Singla
Manmohan Chandraker
Joseph Z. Xu
Jeremiah Liu
49
1
0
22 May 2024
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
Jiaju Lin
Haoxuan Hu
Mamba
39
7
0
22 May 2024
How to train your ViT for OOD Detection
Maximilian Mueller
Matthias Hein
18
0
0
21 May 2024
Quantum Vision Transformers for Quark-Gluon Classification
Marçal Comajoan Cara
Gopal Ramesh Dahale
Zhongtian Dong
Roy T. Forestano
S. Gleyzer
...
Kyoungchul Kong
Tom Magorsch
Konstantin T. Matchev
Katia Matcheva
Eyup B. Unlu
48
9
0
16 May 2024
Understanding Hyperbolic Metric Learning through Hard Negative Sampling
Yun Yue
Fangzhou Lin
Guanyi Mou
Ziming Zhang
SSL
30
1
0
23 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
47
1
0
18 Apr 2024
Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Oren Z. Kraus
Kian Kenyon-Dean
Saber Saberian
Maryam Fallah
Peter McLean
...
Chi Vicky Cheng
Kristen Morse
Maureen Makes
Ben Mabey
Berton A. Earnshaw
37
26
0
16 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani
40
79
0
12 Apr 2024
Struggle with Adversarial Defense? Try Diffusion
Yujie Li
Yanbin Wang
Haitao Xu
Bin Liu
Jianguo Sun
Zhenhao Guo
Wenrui Ma
DiffM
32
1
0
12 Apr 2024
HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion
Jiahang Li
Peng Yun
Qijun Chen
Rui Fan
36
8
0
04 Apr 2024