Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression
arXiv:2109.03228, 7 September 2021
Canwen Xu, Wangchunshu Zhou, Tao Ge, Kelvin J. Xu, Julian McAuley, Furu Wei

Papers citing "Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression" (37 papers)

As easy as PIE: understanding when pruning causes language models to disagree
Pietro Tropeano, Maria Maistro, Tuukka Ruotsalo, Christina Lioma (27 Mar 2025)

Superficial Safety Alignment Hypothesis
Jianwei Li, Jung-Eun Kim (07 Oct 2024)

Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li, Yijun Dong, Qi Lei (26 Jul 2024)

Accuracy is Not All You Need
Abhinav Dutta, Sanjeev Krishnan, Nipun Kwatra, Ramachandran Ramjee (12 Jul 2024)

Understanding the Effect of Model Compression on Social Bias in Large Language Models
Gustavo Gonçalves, Emma Strubell (09 Dec 2023)

Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models
Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu (19 Oct 2023)

A Survey on Model Compression for Large Language Models
Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang (15 Aug 2023)

Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference
Wangchunshu Zhou, Ronan Le Bras, Yejin Choi (04 Jun 2023)

Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
Eugenia Iofinova, Alexandra Peste, Dan Alistarh (25 Apr 2023)

Bridging Fairness and Environmental Sustainability in Natural Language Processing
Marius Hessenthaler, Emma Strubell, Dirk Hovy, Anne Lauscher (08 Nov 2022)

Robust Lottery Tickets for Pre-trained Language Models
Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao Gui, Qi Zhang, Xuanjing Huang (06 Nov 2022)

Intriguing Properties of Compression on Multilingual Models
Kelechi Ogueji, Orevaoghene Ahia, Gbemileke Onilude, Sebastian Gehrmann, Sara Hooker, Julia Kreutzer (04 Nov 2022)

Gradient Knowledge Distillation for Pre-trained Language Models
Lean Wang, Lei Li, Xu Sun (02 Nov 2022)

Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
Q. Si, Yuanxin Liu, Zheng Lin, Peng Fu, Weiping Wang (26 Oct 2022)

Improving Imbalanced Text Classification with Dynamic Curriculum Learning
Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao (25 Oct 2022)

EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning
Tiannan Wang, Wangchunshu Zhou, Yan Zeng, Xinsong Zhang (14 Oct 2022)

A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
Yuanxin Liu, Fandong Meng, Zheng Lin, JiangNan Li, Peng Fu, Yanan Cao, Weiping Wang, Jie Zhou (11 Oct 2022)

Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz (31 Aug 2022)

Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm
Aidan Good, Jia-Huei Lin, Hannah Sieg, Mikey Ferguson, Xin Yu, Shandian Zhe, J. Wieczorek, Thiago Serra (07 Jun 2022)

VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou, Yan Zeng, Shizhe Diao, Xinsong Zhang (30 May 2022)

Parameter-Efficient and Student-Friendly Knowledge Distillation
Jun Rao, Xv Meng, Liang Ding, Shuhan Qi, Dacheng Tao (28 May 2022)

Pruning has a disparate impact on model accuracy
Cuong Tran, Ferdinando Fioretto, Jung-Eun Kim, Rakshit Naidu (26 May 2022)

Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models
Clara Na, Sanket Vaibhav Mehta, Emma Strubell (25 May 2022)

What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier (22 May 2022)

"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset
Eric Michael Smith, Melissa Hall, Melanie Kambadur, Eleonora Presani, Adina Williams (18 May 2022)

Feature Structure Distillation with Centered Kernel Alignment in BERT Transferring
Heeseung Jung, Doyeon Kim, Seung-Hoon Na, Kangil Kim (01 Apr 2022)

A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu, Julian McAuley (15 Feb 2022)

A Survey on Dynamic Neural Networks for Natural Language Processing
Canwen Xu, Julian McAuley (15 Feb 2022)

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers
Andrei-Marius Avram, Darius Catrina, Dumitru-Clementin Cercel, Mihai Dascălu, Traian Rebedea, Vasile Păiș, Dan Tufiș (23 Dec 2021)

A Survey on Green Deep Learning
Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei Li (08 Nov 2021)

Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding
Mengnan Du, Subhabrata Mukherjee, Yu Cheng, Milad Shokouhi, Xia Hu, Ahmed Hassan Awadallah (16 Oct 2021)

BERT Learns to Teach: Knowledge Distillation with Meta Learning
Wangchunshu Zhou, Canwen Xu, Julian McAuley (08 Jun 2021)

BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou (07 Feb 2020)

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro (17 Sep 2019)

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen, Zhen Dong, Jiayu Ye, Linjian Ma, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer (12 Sep 2019)

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)

Adversarial examples in the physical world
Alexey Kurakin, Ian Goodfellow, Samy Bengio (08 Jul 2016)