Fast and Optimal Weight Update for Pruned Large Language Models

Vladimír Boža
1 January 2024

Papers citing "Fast and Optimal Weight Update for Pruned Large Language Models"

9 papers shown

Addition is almost all you need: Compressing neural networks with double binary factorization
Vladimír Boža, Vladimír Macko
MQ · 16 May 2025

Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimír Boža, Vladimír Macko
27 Sep 2024

DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko, Sungnyun Kim, Tianyi Chen, Se-Young Yun
06 Feb 2024

SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman
VLM · 26 Jan 2024

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alexander Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister
ALM · 03 May 2023

Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
Frederik Benzing
ODL · 28 Jan 2022

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, S. Naor, Daniel Soudry
16 Feb 2021

What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag
06 Mar 2020

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 20 Apr 2018