Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.11052
Cited By
Convexifying Transformers: Improving optimization and understanding of transformer networks
20 November 2022
Tolga Ergen
Behnam Neyshabur
Harsh Mehta
MLT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Convexifying Transformers: Improving optimization and understanding of transformer networks"
9 / 9 papers shown
Title
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
42
15
0
08 Feb 2024
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models
Tolga Ergen
Mert Pilanci
21
2
0
19 Dec 2023
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
Aaron Mishkin
Arda Sahiner
Mert Pilanci
OffRL
77
30
0
02 Feb 2022
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Tolga Ergen
Mert Pilanci
34
16
0
18 Oct 2021
Parallel Deep Neural Networks Have Zero Duality Gap
Yifei Wang
Tolga Ergen
Mert Pilanci
79
10
0
13 Oct 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
62
138
0
09 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
304
2,611
0
04 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations
Zong-Yi Li
Nikola B. Kovachki
Kamyar Azizzadenesheli
Burigede Liu
K. Bhattacharya
Andrew M. Stuart
Anima Anandkumar
AI4CE
271
2,315
0
18 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018
1