2204.00595
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
1 April 2022
Tri Dao
Beidi Chen
N. Sohoni
Arjun D Desai
Michael Poli
Jessica Grogan
Alexander Liu
Aniruddh Rao
Atri Rudra
Christopher Ré
Papers citing
"Monarch: Expressive Structured Matrices for Efficient and Accurate Training"
16 / 66 papers shown
Title
Convolution-enhanced Evolving Attention Networks
Yujing Wang
Yaming Yang
Zhuowan Li
Jiangang Bai
Mingliang Zhang
Xiangtai Li
Jiahao Yu
Ce Zhang
Gao Huang
Yu Tong
ViT
27
6
0
16 Dec 2022
Guiding continuous operator learning through Physics-based boundary constraints
Nadim Saad
Gaurav Gupta
S. Alizadeh
Danielle C. Maddix
AI4CE
48
20
0
14 Dec 2022
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations
Zirui Liu
Sheng-Wei Chen
Kaixiong Zhou
Daochen Zha
Xiao Huang
Xia Hu
32
15
0
19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
40
58
0
20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
33
109
0
31 Aug 2022
A Structured Sparse Neural Network and Its Matrix Calculations Algorithm
S. Sarayi
M. Bahrami
13
0
0
02 Jul 2022
Arithmetic Circuits, Structured Matrices and (not so) Deep Learning
Atri Rudra
21
1
0
24 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
104
2,055
0
27 May 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models
Xuxi Chen
Tianlong Chen
Weizhu Chen
Ahmed Hassan Awadallah
Zhangyang Wang
Yu Cheng
MoE
ALM
20
10
0
30 Oct 2021
Efficient Identification of Butterfly Sparse Matrix Factorizations
Léon Zheng
E. Riccietti
Rémi Gribonval
44
6
0
04 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
292
2,611
0
04 May 2021
Initialization and Regularization of Factorized Neural Layers
M. Khodak
Neil A. Tenenholtz
Lester W. Mackey
Nicolò Fusi
65
56
0
03 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations
Zong-Yi Li
Nikola B. Kovachki
Kamyar Azizzadenesheli
Burigede Liu
K. Bhattacharya
Andrew M. Stuart
Anima Anandkumar
AI4CE
262
2,309
0
18 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,505
0
23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,833
0
17 Sep 2019