arXiv:2211.11052
Convexifying Transformers: Improving optimization and understanding of transformer networks
20 November 2022
Tolga Ergen
Behnam Neyshabur
Harsh Mehta
MLT
Papers citing "Convexifying Transformers: Improving optimization and understanding of transformer networks"
9 / 9 papers shown
Title
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
42
15
0
08 Feb 2024
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models
Tolga Ergen
Mert Pilanci
26
2
0
19 Dec 2023
Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
Aaron Mishkin
Arda Sahiner
Mert Pilanci
OffRL
80
30
0
02 Feb 2022
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Tolga Ergen
Mert Pilanci
37
16
0
18 Oct 2021
Parallel Deep Neural Networks Have Zero Duality Gap
Yifei Wang
Tolga Ergen
Mert Pilanci
79
10
0
13 Oct 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
62
139
0
09 Sep 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
306
2,615
0
04 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations
Zong-Yi Li
Nikola B. Kovachki
Kamyar Azizzadenesheli
Burigede Liu
K. Bhattacharya
Andrew M. Stuart
Anima Anandkumar
AI4CE
271
2,320
0
18 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
314
7,020
0
20 Apr 2018