A Model Zoo of Vision Transformers

14 April 2025

Papers citing "A Model Zoo of Vision Transformers"

50 / 56 papers shown

Title
Structure Is Not Enough: Leveraging Behavior for Neural Network Weight Reconstruction Léo Meynent Ivan Melev Konstantin Schürholt Göran Kauermann Damian Borth 101 3 0 21 Mar 2025
Local and Global Decoding in Text Generation Daniel Gareev Thomas Hofmann Ezhilmathi Krishnasamy Tiago Pimentel 65 4 0 14 Oct 2024
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models Theo Putterman Derek Lim Yoav Gelberg Stefanie Jegelka Haggai Maron AI4CE 70 6 0 05 Oct 2024
MD tree: a model-diagnostic tree grown on loss landscape Yefan Zhou Jianlong Chen Qinxue Cao Konstantin Schürholt Yaoqing Yang 68 2 0 24 Jun 2024
Neural Lineage Runpeng Yu Xinchao Wang 84 4 0 17 Jun 2024
Towards Scalable and Versatile Weight Space Learning Konstantin Schürholt Michael W. Mahoney Damian Borth 82 17 0 14 Jun 2024
Graph Neural Networks for Learning Equivariant Representations of Neural Networks Miltiadis Kofinas Boris Knyazev Yan Zhang Yunlu Chen Gertjan J. Burghouts E. Gavves Cees G. M. Snoek David W. Zhang 85 33 0 18 Mar 2024
Universal Neural Functionals Allan Zhou Chelsea Finn James Harrison 80 15 0 07 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models Alexandre Ramé Nino Vieillard Léonard Hussenot Robert Dadashi Geoffrey Cideron Olivier Bachem Johan Ferret 154 100 0 22 Jan 2024
Graph Metanetworks for Processing Diverse Neural Architectures Derek Lim Haggai Maron Marc T. Law Jonathan Lorraine James Lucas AI4CE 67 36 0 07 Dec 2023
Initializing Models with Larger Ones Zhiqiu Xu Yanjie Chen Kirill Vishniakov Yida Yin Zhiqiang Shen Trevor Darrell Lingjie Liu Zhuang Liu 58 21 0 30 Nov 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron Louis Martin Kevin R. Stone Peter Albert Amjad Almahairi ... Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom AI4MH ALM 290 11,858 0 18 Jul 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models D. Honegger Konstantin Schürholt Damian Borth 61 4 0 26 Apr 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization Alexandre Ramé Kartik Ahuja Jianyu Zhang Matthieu Cord Léon Bottou David Lopez-Paz MoMe OODD 70 83 0 20 Dec 2022
ZooD: Exploiting Model Zoo for Out-of-Distribution Generalization Qishi Dong Muhammad Awais Fengwei Zhou Chuanlong Xie Tianyang Hu Yongxin Yang Sung-Ho Bae Zhenguo Li OODD VLM 72 13 0 17 Oct 2022
Model Zoos: A Dataset of Diverse Populations of Neural Network Models Konstantin Schürholt Diyar Taskiran Boris Knyazev Xavier Giró-i-Nieto Damian Borth 114 30 0 29 Sep 2022
Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights Konstantin Schürholt Boris Knyazev Xavier Giró-i-Nieto Damian Borth 104 42 0 29 Sep 2022
Learning to Learn with Generative Models of Neural Network Checkpoints William S. Peebles Ilija Radosavovic Tim Brooks Alexei A. Efros Jitendra Malik UQCV 121 68 0 26 Sep 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries Samuel K. Ainsworth J. Hayase S. Srinivasa MoMe 287 330 0 11 Sep 2022
Hyper-Representations for Pre-Training and Transfer Learning Konstantin Schürholt Boris Knyazev Xavier Giró-i-Nieto Damian Borth 62 10 0 22 Jul 2022
Towards Learning Universal Hyperparameter Optimizers with Transformers Yutian Chen Xingyou Song Chansoo Lee Zehao Wang Qiuyi Zhang ... Greg Kochanski Arnaud Doucet Marc'Aurelio Ranzato Sagi Perel Nando de Freitas 80 65 0 26 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet Ethan Huynh ViT 73 11 0 21 May 2022
Better plain ViT baselines for ImageNet-1k Lucas Beyer Xiaohua Zhai Alexander Kolesnikov ViT VLM 63 116 0 03 May 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time Mitchell Wortsman Gabriel Ilharco S. Gadre Rebecca Roelofs Raphael Gontijo-Lopes ... Hongseok Namkoong Ali Farhadi Y. Carmon Simon Kornblith Ludwig Schmidt MoMe 136 981 1 10 Mar 2022
Stochastic Weight Averaging Revisited Hao Guo Jiyong Jin B. Liu 54 30 0 03 Jan 2022
Parameter Prediction for Unseen Deep Architectures Boris Knyazev M. Drozdzal Graham W. Taylor Adriana Romero Soriano OOD 81 83 0 25 Oct 2021
Robust fine-tuning of zero-shot models Mitchell Wortsman Gabriel Ilharco Jong Wook Kim Mike Li Simon Kornblith ... Raphael Gontijo-Lopes Hannaneh Hajishirzi Ali Farhadi Hongseok Namkoong Ludwig Schmidt VLM 124 724 0 04 Sep 2021
Taxonomizing local versus global structure in neural network loss landscapes Yaoqing Yang Liam Hodgkinson Ryan Theisen Joe Zou Joseph E. Gonzalez Kannan Ramchandran Michael W. Mahoney 82 37 0 23 Jul 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers Andreas Steiner Alexander Kolesnikov Xiaohua Zhai Ross Wightman Jakob Uszkoreit Lucas Beyer ViT 107 632 0 18 Jun 2021
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations Xiangning Chen Cho-Jui Hsieh Boqing Gong ViT 84 328 0 03 Jun 2021
Self-Supervised Pretraining Improves Self-Supervised Pretraining Colorado Reed Xiangyu Yue Aniruddha Nrusimha Sayna Ebrahimi Vivek Vijaykumar ... Shanghang Zhang Devin Guillory Sean L. Metzger Kurt Keutzer Trevor Darrell 69 108 0 23 Mar 2021
Training data-efficient image transformers & distillation through attention Hugo Touvron Matthieu Cord Matthijs Douze Francisco Massa Alexandre Sablayrolles Hervé Jégou ViT 377 6,762 0 23 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai ... Matthias Minderer G. Heigold Sylvain Gelly Jakob Uszkoreit N. Houlsby ViT 637 41,003 0 22 Oct 2020
RobustBench: a standardized adversarial robustness benchmark Francesco Croce Maksym Andriushchenko Vikash Sehwag Edoardo Debenedetti Nicolas Flammarion M. Chiang Prateek Mittal Matthias Hein VLM 316 702 0 19 Oct 2020
Predicting Neural Network Accuracy from Weights Thomas Unterthiner Daniel Keysers Sylvain Gelly Olivier Bousquet Ilya O. Tolstikhin 59 105 0 26 Feb 2020
Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data Charles H. Martin Tongsu Peng Michael W. Mahoney 80 108 0 17 Feb 2020
A Simple Framework for Contrastive Learning of Visual Representations Ting Chen Simon Kornblith Mohammad Norouzi Geoffrey E. Hinton SSL 361 18,752 0 13 Feb 2020
Classifying the classifier: dissecting the weight space of neural networks Gabriel Eilertsen Daniel Jonsson Timo Ropinski Jonas Unger Anders Ynnerman 51 54 0 13 Feb 2020
NAS-Bench-201: Extending the Scope of Reproducible Neural Architecture Search Xuanyi Dong Yi Yang 135 711 0 02 Jan 2020
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features Sangdoo Yun Dongyoon Han Seong Joon Oh Sanghyuk Chun Junsuk Choe Y. Yoo OOD 609 4,778 0 13 May 2019
Similarity of Neural Network Representations Revisited Simon Kornblith Mohammad Norouzi Honglak Lee Geoffrey E. Hinton 141 1,415 0 01 May 2019
Knowledge Flow: Improve Upon Your Teachers Iou-Jen Liu Jian-wei Peng Alex Schwing 88 62 0 11 Apr 2019
NAS-Bench-101: Towards Reproducible Neural Architecture Search Chris Ying Aaron Klein Esteban Real Eric Christiansen Kevin Patrick Murphy Frank Hutter 80 683 0 25 Feb 2019
HyperGAN: A Generative Model for Diverse, Performant Neural Networks Neale Ratzlaff Fuxin Li 69 64 0 30 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.7K 94,770 0 11 Oct 2018
Averaging Weights Leads to Wider Optima and Better Generalization Pavel Izmailov Dmitrii Podoprikhin T. Garipov Dmitry Vetrov A. Wilson FedML MoMe 121 1,659 0 14 Mar 2018
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries Z. Yao A. Gholami Qi Lei Kurt Keutzer Michael W. Mahoney 63 167 0 22 Feb 2018
Population Based Training of Neural Networks Max Jaderberg Valentin Dalibard Simon Osindero Wojciech M. Czarnecki Jeff Donahue ... Tim Green Iain Dunning Karen Simonyan Chrisantha Fernando Koray Kavukcuoglu 71 741 0 27 Nov 2017
Decoupled Weight Decay Regularization I. Loshchilov Frank Hutter OffRL 144 2,136 0 14 Nov 2017
Random Erasing Data Augmentation Zhun Zhong Liang Zheng Guoliang Kang Shaozi Li Yi Yang 90 3,635 0 16 Aug 2017