ResearchTrend.AI
Transformers from an Optimization Perspective

27 May 2022
Yongyi Yang
Zengfeng Huang
David Wipf

Papers citing "Transformers from an Optimization Perspective"

9 / 9 papers shown
ICLR: In-Context Learning of Representations
Core Francisco Park
Andrew Lee
Ekdeep Singh Lubana
Yongyi Yang
Maya Okawa
Kento Nishi
Martin Wattenberg
Hidenori Tanaka
AIFin
118
3
0
29 Dec 2024
iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer
Toshihiro Ota
Masato Taki
29
2
0
25 Apr 2023
Optimization-Derived Learning with Essential Convergence Analysis of Training and Hyper-training
Risheng Liu
Xuan Liu
Shangzhi Zeng
Jin Zhang
Yixuan Zhang
40
6
0
16 Jun 2022
IGLU: Efficient GCN Training via Lazy Updates
S. Narayanan
Aditya Sinha
Prateek Jain
Purushottam Kar
Sundararajan Sellamanickam
BDL
52
9
0
28 Sep 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng
Meng-Hao Guo
Hongxu Chen
Xia Li
Ke Wei
Zhouchen Lin
56
137
0
09 Sep 2021
Elastic Graph Neural Networks
Xiaorui Liu
W. Jin
Yao Ma
Yaxin Li
Hua Liu
Yiqi Wang
Ming Yan
Jiliang Tang
92
107
0
05 Jul 2021
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
M. Bronstein
Joan Bruna
Taco S. Cohen
Petar Veličković
GNN
174
1,104
0
27 Apr 2021
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
178
598
0
22 Sep 2016
A Proximal Stochastic Gradient Method with Progressive Variance Reduction
Lin Xiao
Tong Zhang
ODL
84
736
0
19 Mar 2014