ResearchTrend.AI
Home › Papers › 2202.03599 › Cited By
Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning
v3 (latest) · 8 February 2022
Yang Zhao, Hao Zhang, Xiuyuan Hu
ArXiv (abs) · PDF · HTML · GitHub (39★)
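The indexed paper's core idea is to regularize training with the norm of the loss gradient. A minimal illustrative sketch of that idea is below, using NumPy on an analytic quadratic loss rather than a network; the coefficients `lam` and `r` and all function names are placeholders for illustration, not the paper's exact procedure.

```python
import numpy as np

# Illustrative sketch of gradient-norm penalization (not the paper's exact
# recipe): descend on L(w) + lam * ||grad L(w)|| for the quadratic loss
# L(w) = 0.5 * w^T A w, whose gradient A @ w is analytic.

def loss_grad(w, A):
    # Gradient of 0.5 * w^T A w
    return A @ w

def penalized_step(w, A, lr=0.05, lam=0.01, r=0.01):
    g = loss_grad(w, A)
    gnorm = np.linalg.norm(g) + 1e-12
    # Finite-difference estimate of the penalty's gradient:
    # (grad L(w + r*g/||g||) - grad L(w)) / r  ~  Hessian @ g / ||g||,
    # which approximates the gradient of ||grad L(w)||.
    g_ahead = loss_grad(w + r * g / gnorm, A)
    total = g + lam * (g_ahead - g) / r
    return w - lr * total

A = np.diag([1.0, 10.0])          # ill-conditioned quadratic
w = np.array([1.0, 1.0])
init_gnorm = np.linalg.norm(loss_grad(w, A))
for _ in range(200):
    w = penalized_step(w, A)
final_gnorm = np.linalg.norm(loss_grad(w, A))
```

In a real training loop the two gradient evaluations would come from backprop on the network loss, but the structure (one gradient at the current point, one at a point nudged along the normalized gradient) is the same.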

Papers citing "Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning"

25 / 25 papers shown
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization
C. Tan, Yubo Zhou, Haishan Ye, Guang Dai, Junmin Liu, Zengjie Song, Jiangshe Zhang, Zixiang Zhao, Yunda Hao, Yong Xu
AAML · 29 May 2025
Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning
Sunwoo Lee
18 Mar 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru, Yuxin Xie, Xianwei Zhuang, Yuguo Yin, Zhihui Guo, Zhiming Liu, Qianli Ren, Yuexian Zou
10 Feb 2025
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
Mingjie Li, Wai Man Si, Michael Backes, Yang Zhang, Yisen Wang
03 Jan 2025
Tilted Sharpness-Aware Minimization
Tian Li, Dinesh Manocha, J. Bilmes
30 Oct 2024
PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization
Yao Ni, Shan Zhang, Piotr Koniusz
25 Sep 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan, Lam C. Tran, Quyen Tran, Trung Le
13 Jun 2024
SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data
Sakshi Choudhary, Sai Aparna Aketi, Kaushik Roy
FedML · 22 May 2024
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Yichu Xu, Xin-Chun Li, Lan Li, De-Chuan Zhan
21 May 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan, Mingrui Liu, Amarda Shehu
01 Mar 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin, Atish Agarwala, Hossein Mobahi
FAtt · 19 Jan 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift
Renchunzi Xie, Ambroise Odonnat, Vasilii Feofanov, I. Redko, Jianfeng Zhang, Bo An
UQCV · 17 Jan 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee
AAML · 29 Nov 2023
Small batch deep reinforcement learning
J. Obando-Ceron, Marc G. Bellemare, Pablo Samuel Castro
VLM · 05 Oct 2023
Flatness-Aware Minimization for Domain Generalization
Xingxuan Zhang, Renzhe Xu, Han Yu, Yancheng Dong, Pengfei Tian, Peng Cui
20 Jul 2023
Why Does Little Robustness Help? Understanding and Improving Adversarial Transferability from Surrogate Training
Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li, Xiaogeng Liu, Wei Wan, Hai Jin
AAML · 15 Jul 2023
An Adaptive Policy to Employ Sharpness-Aware Minimization
Weisen Jiang, Hansi Yang, Yu Zhang, James T. Kwok
AAML · 28 Apr 2023
Per-Example Gradient Regularization Improves Learning Signals from Noisy Data
Xuran Meng, Yuan Cao, Difan Zou
31 Mar 2023
Improving Differentiable Architecture Search via Self-Distillation
Xunyu Zhu, Jian Li, Yong Liu, Weiping Wang
11 Feb 2023
Improving the Model Consistency of Decentralized Federated Learning
Yi Shi, Li Shen, Kang Wei, Yan Sun, Bo Yuan, Xueqian Wang, Dacheng Tao
FedML · 08 Feb 2023
Improving Multi-task Learning via Seeking Task-based Flat Regions
Hoang Phan, Lam C. Tran, Ngoc N. Tran, Nhat Ho, Tuan Truong, Qi Lei, Dinh Q. Phung, Trung Le
24 Nov 2022
How Does Sharpness-Aware Minimization Minimize Sharpness?
Kaiyue Wen, Tengyu Ma, Zhiyuan Li
AAML · 10 Nov 2022
Make Sharpness-Aware Minimization Stronger: A Sparsified Perturbation Approach
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji, Dacheng Tao
AAML · 11 Oct 2022
Understanding Gradient Regularization in Deep Learning: Efficient Finite-Difference Computation and Implicit Bias
Ryo Karakida, Tomoumi Takase, Tomohiro Hayase, Kazuki Osawa
06 Oct 2022
On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective
Zeke Xie, Zhiqiang Xu, Jingzhao Zhang, Issei Sato, Masashi Sugiyama
23 Nov 2020