2504.10231
A Model Zoo of Vision Transformers
14 April 2025
Damian Falk
Léo Meynent
Florence Pfammatter
Konstantin Schürholt
Damian Borth
Papers citing
"A Model Zoo of Vision Transformers"
50 / 56 papers shown
Title
Structure Is Not Enough: Leveraging Behavior for Neural Network Weight Reconstruction
Léo Meynent
Ivan Melev
Konstantin Schürholt
Göran Kauermann
Damian Borth
101
3
0
21 Mar 2025
Local and Global Decoding in Text Generation
Daniel Gareev
Thomas Hofmann
Ezhilmathi Krishnasamy
Tiago Pimentel
65
4
0
14 Oct 2024
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models
Theo Putterman
Derek Lim
Yoav Gelberg
Stefanie Jegelka
Haggai Maron
AI4CE
70
6
0
05 Oct 2024
MD tree: a model-diagnostic tree grown on loss landscape
Yefan Zhou
Jianlong Chen
Qinxue Cao
Konstantin Schürholt
Yaoqing Yang
68
2
0
24 Jun 2024
Neural Lineage
Runpeng Yu
Xinchao Wang
84
4
0
17 Jun 2024
Towards Scalable and Versatile Weight Space Learning
Konstantin Schürholt
Michael W. Mahoney
Damian Borth
82
17
0
14 Jun 2024
Graph Neural Networks for Learning Equivariant Representations of Neural Networks
Miltiadis Kofinas
Boris Knyazev
Yan Zhang
Yunlu Chen
Gertjan J. Burghouts
E. Gavves
Cees G. M. Snoek
David W. Zhang
85
33
0
18 Mar 2024
Universal Neural Functionals
Allan Zhou
Chelsea Finn
James Harrison
80
15
0
07 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
154
100
0
22 Jan 2024
Graph Metanetworks for Processing Diverse Neural Architectures
Derek Lim
Haggai Maron
Marc T. Law
Jonathan Lorraine
James Lucas
AI4CE
67
36
0
07 Dec 2023
Initializing Models with Larger Ones
Zhiqiu Xu
Yanjie Chen
Kirill Vishniakov
Yida Yin
Zhiqiang Shen
Trevor Darrell
Lingjie Liu
Zhuang Liu
58
21
0
30 Nov 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
290
11,858
0
18 Jul 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
D. Honegger
Konstantin Schürholt
Damian Borth
61
4
0
26 Apr 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
70
83
0
20 Dec 2022
ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization
Qishi Dong
Muhammad Awais
Fengwei Zhou
Chuanlong Xie
Tianyang Hu
Yongxin Yang
Sung-Ho Bae
Zhenguo Li
OODD
VLM
72
13
0
17 Oct 2022
Model Zoos: A Dataset of Diverse Populations of Neural Network Models
Konstantin Schürholt
Diyar Taskiran
Boris Knyazev
Xavier Giró-i-Nieto
Damian Borth
114
30
0
29 Sep 2022
Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights
Konstantin Schürholt
Boris Knyazev
Xavier Giró-i-Nieto
Damian Borth
104
42
0
29 Sep 2022
Learning to Learn with Generative Models of Neural Network Checkpoints
William S. Peebles
Ilija Radosavovic
Tim Brooks
Alexei A. Efros
Jitendra Malik
UQCV
121
68
0
26 Sep 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
287
330
0
11 Sep 2022
Hyper-Representations for Pre-Training and Transfer Learning
Konstantin Schürholt
Boris Knyazev
Xavier Giró-i-Nieto
Damian Borth
62
10
0
22 Jul 2022
Towards Learning Universal Hyperparameter Optimizers with Transformers
Yutian Chen
Xingyou Song
Chansoo Lee
Zehao Wang
Qiuyi Zhang
...
Greg Kochanski
Arnaud Doucet
Marc'Aurelio Ranzato
Sagi Perel
Nando de Freitas
80
65
0
26 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
73
11
0
21 May 2022
Better plain ViT baselines for ImageNet-1k
Lucas Beyer
Xiaohua Zhai
Alexander Kolesnikov
ViT
VLM
63
116
0
03 May 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
136
981
1
10 Mar 2022
Stochastic Weight Averaging Revisited
Hao Guo
Jiyong Jin
B. Liu
54
30
0
03 Jan 2022
Parameter Prediction for Unseen Deep Architectures
Boris Knyazev
M. Drozdzal
Graham W. Taylor
Adriana Romero Soriano
OOD
81
83
0
25 Oct 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
124
724
0
04 Sep 2021
Taxonomizing local versus global structure in neural network loss landscapes
Yaoqing Yang
Liam Hodgkinson
Ryan Theisen
Joe Zou
Joseph E. Gonzalez
Kannan Ramchandran
Michael W. Mahoney
82
37
0
23 Jul 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
107
632
0
18 Jun 2021
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
Xiangning Chen
Cho-Jui Hsieh
Boqing Gong
ViT
84
328
0
03 Jun 2021
Self-Supervised Pretraining Improves Self-Supervised Pretraining
Colorado Reed
Xiangyu Yue
Aniruddha Nrusimha
Sayna Ebrahimi
Vivek Vijaykumar
...
Shanghang Zhang
Devin Guillory
Sean L. Metzger
Kurt Keutzer
Trevor Darrell
69
108
0
23 Mar 2021
Training data-efficient image transformers & distillation through attention
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
377
6,762
0
23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
637
41,003
0
22 Oct 2020
RobustBench: a standardized adversarial robustness benchmark
Francesco Croce
Maksym Andriushchenko
Vikash Sehwag
Edoardo Debenedetti
Nicolas Flammarion
M. Chiang
Prateek Mittal
Matthias Hein
VLM
316
702
0
19 Oct 2020
Predicting Neural Network Accuracy from Weights
Thomas Unterthiner
Daniel Keysers
Sylvain Gelly
Olivier Bousquet
Ilya O. Tolstikhin
59
105
0
26 Feb 2020
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data
Charles H. Martin
Tongsu Peng
Michael W. Mahoney
80
108
0
17 Feb 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
361
18,752
0
13 Feb 2020
Classifying the classifier: dissecting the weight space of neural networks
Gabriel Eilertsen
Daniel Jonsson
Timo Ropinski
Jonas Unger
Anders Ynnerman
51
54
0
13 Feb 2020
NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search
Xuanyi Dong
Yi Yang
135
711
0
02 Jan 2020
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Sangdoo Yun
Dongyoon Han
Seong Joon Oh
Sanghyuk Chun
Junsuk Choe
Y. Yoo
OOD
609
4,778
0
13 May 2019
Similarity of Neural Network Representations Revisited
Simon Kornblith
Mohammad Norouzi
Honglak Lee
Geoffrey E. Hinton
141
1,415
0
01 May 2019
Knowledge Flow: Improve Upon Your Teachers
Iou-Jen Liu
Jian-wei Peng
Alex Schwing
88
62
0
11 Apr 2019
NAS-Bench-101: Towards Reproducible Neural Architecture Search
Chris Ying
Aaron Klein
Esteban Real
Eric Christiansen
Kevin Patrick Murphy
Frank Hutter
80
683
0
25 Feb 2019
HyperGAN: A Generative Model for Diverse, Performant Neural Networks
Neale Ratzlaff
Fuxin Li
69
64
0
30 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.7K
94,770
0
11 Oct 2018
Averaging Weights Leads to Wider Optima and Better Generalization
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML
MoMe
121
1,659
0
14 Mar 2018
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Z. Yao
A. Gholami
Qi Lei
Kurt Keutzer
Michael W. Mahoney
63
167
0
22 Feb 2018
Population Based Training of Neural Networks
Max Jaderberg
Valentin Dalibard
Simon Osindero
Wojciech M. Czarnecki
Jeff Donahue
...
Tim Green
Iain Dunning
Karen Simonyan
Chrisantha Fernando
Koray Kavukcuoglu
71
741
0
27 Nov 2017
Decoupled Weight Decay Regularization
I. Loshchilov
Frank Hutter
OffRL
144
2,136
0
14 Nov 2017
Random Erasing Data Augmentation
Zhun Zhong
Liang Zheng
Guoliang Kang
Shaozi Li
Yi Yang
90
3,635
0
16 Aug 2017