2205.13891
Transformers from an Optimization Perspective
27 May 2022
Yongyi Yang
Zengfeng Huang
David Wipf
Papers citing
"Transformers from an Optimization Perspective"
9 / 9 papers shown
Title
ICLR: In-Context Learning of Representations
Core Francisco Park
Andrew Lee
Ekdeep Singh Lubana
Yongyi Yang
Maya Okawa
Kento Nishi
Martin Wattenberg
Hidenori Tanaka
AIFin
118
3
0
29 Dec 2024
iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer
Toshihiro Ota
Masato Taki
29
2
0
25 Apr 2023
Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training
Risheng Liu
Xuan Liu
Shangzhi Zeng
Jin Zhang
Yixuan Zhang
40
6
0
16 Jun 2022
IGLU: Efficient GCN Training via Lazy Updates
S. Narayanan
Aditya Sinha
Prateek Jain
Purushottam Kar
Sundararajan Sellamanickam
BDL
52
9
0
28 Sep 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
56
137
0
09 Sep 2021
Elastic Graph Neural Networks
Xiaorui Liu
W. Jin
Yao Ma
Yaxin Li
Hua Liu
Yiqi Wang
Ming Yan
Jiliang Tang
92
107
0
05 Jul 2021
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
M. Bronstein
Joan Bruna
Taco S. Cohen
Petar Veličković
GNN
174
1,104
0
27 Apr 2021
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
178
598
0
22 Sep 2016
A Proximal Stochastic Gradient Method with Progressive Variance Reduction
Lin Xiao
Tong Zhang
ODL
84
736
0
19 Mar 2014