Monarch: Expressive Structured Matrices for Efficient and Accurate Training

1 April 2022

Christopher Ré

Papers citing "Monarch: Expressive Structured Matrices for Efficient and Accurate Training"

16 / 66 papers shown

Title | Authors | Tags | Counts | Date
Convolution-enhanced Evolving Attention Networks | Yujing Wang, Yaming Yang, Zhuowan Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, Jiahao Yu, Ce Zhang, Gao Huang, Yu Tong | ViT | 27 6 0 | 16 Dec 2022
Guiding continuous operator learning through Physics-based boundary constraints | Nadim Saad, Gaurav Gupta, S. Alizadeh, Danielle C. Maddix | AI4CE | 48 20 0 | 14 Dec 2022
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations | Zirui Liu, Sheng-Wei Chen, Kaixiong Zhou, Daochen Zha, Xiao Huang, Xia Hu | - | 32 15 0 | 19 Oct 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities | Brian Bartoldson, B. Kailkhura, Davis W. Blalock | - | 31 47 0 | 13 Oct 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design | Hongxiang Fan, Thomas C. P. Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah | - | 40 58 0 | 20 Sep 2022
Efficient Methods for Natural Language Processing: A Survey | Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz | - | 33 109 0 | 31 Aug 2022
A Structured Sparse Neural Network and Its Matrix Calculations Algorithm | S. Sarayi, M. Bahrami | - | 13 0 0 | 02 Jul 2022
Arithmetic Circuits, Structured Matrices and (not so) Deep Learning | Atri Rudra | - | 21 1 0 | 24 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré | VLM | 104 2,055 0 | 27 May 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models | Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng | MoE, ALM | 20 10 0 | 30 Oct 2021
Efficient Identification of Butterfly Sparse Matrix Factorizations | Léon Zheng, E. Riccietti, Rémi Gribonval | - | 44 6 0 | 04 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision | Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy | - | 292 2,611 0 | 04 May 2021
Initialization and Regularization of Factorized Neural Layers | M. Khodak, Neil A. Tenenholtz, Lester W. Mackey, Nicolò Fusi | - | 65 56 0 | 03 May 2021
Fourier Neural Operator for Parametric Partial Differential Equations | Zong-Yi Li, Nikola B. Kovachki, Kamyar Azizzadenesheli, Burigede Liu, K. Bhattacharya, Andrew M. Stuart, Anima Anandkumar | AI4CE | 262 2,309 0 | 18 Oct 2020
Scaling Laws for Neural Language Models | Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei | - | 264 4,505 0 | 23 Jan 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism | M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro | MoE | 245 1,833 0 | 17 Sep 2019