2306.13575
Cited By
Scaling MLPs: A Tale of Inductive Bias
23 June 2023
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
Papers citing "Scaling MLPs: A Tale of Inductive Bias"
17 / 17 papers shown
Title
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
92
0
0
27 Feb 2025
Exploring Kolmogorov-Arnold Networks for Interpretable Time Series Classification
Irina Barašin
Blaž Bertalanič
M. Mohorčič
Carolina Fortuna
AI4TS
154
2
0
22 Nov 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
60
20
0
27 Jun 2024
Kolmogorov-Arnold Networks (KANs) for Time Series Analysis
Cristian J. Vaca-Rubio
Luis Blanco
Roberto Pereira
Marius Caus
AI4TS
21
98
0
14 May 2024
Neural Redshift: Random Networks are not Random Functions
Damien Teney
A. Nicolicioiu
Valentin Hartmann
Ehsan Abbasnejad
103
19
0
04 Mar 2024
GLIMPSE: Generalized Local Imaging with MLPs
AmirEhsan Khorashadizadeh
Valentin Debarnot
Tianlin Liu
Ivan Dokmanić
36
1
0
01 Jan 2024
Transformer Fusion with Optimal Transport
Moritz Imfeld
Jacopo Graldi
Marco Giordano
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
ViT
MoMe
32
16
0
09 Oct 2023
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
48
8
0
07 Sep 2023
The Curious Case of Benign Memorization
Sotiris Anagnostidis
Gregor Bachmann
Lorenzo Noci
Thomas Hofmann
AAML
49
8
0
25 Oct 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
225
402
0
24 Jan 2022
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
286
2,606
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
368
5,811
0
29 Apr 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
187
689
0
22 Apr 2021
Towards Learning Convolutions from Scratch
Behnam Neyshabur
SSL
220
71
0
27 Jul 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,505
0
23 Jan 2020
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,892
0
15 Sep 2016
Convolution by Evolution: Differentiable Pattern Producing Networks
Chrisantha Fernando
Dylan Banarse
Malcolm Reynolds
F. Besse
David Pfau
Max Jaderberg
Marc Lanctot
Daan Wierstra
191
102
0
08 Jun 2016