Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

8 February 2022
Yang Zhao
Hao Zhang
Xiuyuan Hu

Papers citing "Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning"

50 / 83 papers shown
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
Yeoreum Lee
Jinwook Jung
Sungyong Baik
MoMe
42
0
0
20 Apr 2025
Layer-wise Adaptive Gradient Norm Penalizing Method for Efficient and Accurate Deep Learning
Sunwoo Lee
112
0
0
18 Mar 2025
Convergence Analysis of Federated Learning Methods Using Backward Error Analysis
Jinwoo Lim
Suhyun Kim
Soo-Mook Moon
FedML
60
0
0
05 Mar 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
83
2
0
10 Feb 2025
Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning
Rémy Hosseinkhan Boucher
Onofrio Semeraro
L. Mathelin
82
0
0
28 Jan 2025
Preconditioned Sharpness-Aware Minimization: Unifying Analysis and a Novel Learning Algorithm
Yilang Zhang
Bingcong Li
G. Giannakis
AAML
39
0
0
11 Jan 2025
Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization
Yuhao He
Jinyu Tian
Xianwei Zheng
Li Dong
Yuanman Li
L. Zhang
AAML
28
0
0
06 Nov 2024
Reweighting Local Minima with Tilted SAM
Tian Li
Dinesh Manocha
J. Bilmes
33
0
0
30 Oct 2024
Implicit Regularization of Sharpness-Aware Minimization for Scale-Invariant Problems
Bingcong Li
Liang Zhang
Niao He
43
3
0
18 Oct 2024
Sharpness-Aware Black-Box Optimization
Feiyang Ye
Yueming Lyu
Xuehao Wang
Masashi Sugiyama
Yu-Jie Zhang
Ivor W. Tsang
AAML
47
0
0
16 Oct 2024
Combinatorial Multi-armed Bandits: Arm Selection via Group Testing
Arpan Mukherjee
Shashanka Ubaru
K. Murugesan
Karthikeyan Shanmugam
A. Tajer
41
1
0
14 Oct 2024
Understanding Adversarially Robust Generalization via Weight-Curvature Index
Yuelin Xu
Xiao Zhang
AAML
32
0
0
10 Oct 2024
Preconditioning for Accelerated Gradient Descent Optimization and Regularization
Qiang Ye
AI4CE
26
0
0
30 Sep 2024
Neural Network Plasticity and Loss Sharpness
Max Koster
Jude Kukla
23
0
0
25 Sep 2024
PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization
Yao Ni
Shan Zhang
Piotr Koniusz
151
2
0
25 Sep 2024
Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation
Minjie Zhu
Yichen Zhu
Jinming Li
Junjie Wen
Zhiyuan Xu
...
Ran Cheng
Chaomin Shen
Yaxin Peng
Feifei Feng
Jian Tang
38
13
0
22 Sep 2024
Bilateral Sharpness-Aware Minimization for Flatter Minima
Jiaxin Deng
Junbiao Pang
Baochang Zhang
Qingming Huang
AAML
121
0
0
20 Sep 2024
Enhancing Sharpness-Aware Minimization by Learning Perturbation Radius
Xuehao Wang
Weisen Jiang
Shuai Fu
Yu Zhang
AAML
47
0
0
15 Aug 2024
DataFreeShield: Defending Adversarial Attacks without Training Data
Hyeyoon Lee
Kanghyun Choi
Dain Kwon
Sunjong Park
Mayoore S. Jaiswal
Noseong Park
Jonghyun Choi
Jinho Lee
36
0
0
21 Jun 2024
When Will Gradient Regularization Be Harmful?
Yang Zhao
Hao Zhang
Xiuyuan Hu
AI4CE
34
1
0
14 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
52
0
0
13 Jun 2024
The EarlyBird Gets the WORM: Heuristically Accelerating EarlyBird Convergence
Adithya Vasudev
23
0
0
31 May 2024
Sharpness-Aware Minimization Enhances Feature Quality via Balanced Learning
Jacob Mitchell Springer
Vaishnavh Nagarajan
Aditi Raghunathan
44
5
0
30 May 2024
Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness
Weilin Lin
Li Liu
Shaokui Wei
Jianze Li
Hui Xiong
AAML
50
2
0
30 May 2024
Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Ziqing Fan
Shengchao Hu
Jiangchao Yao
Gang Niu
Ya-Qin Zhang
Masashi Sugiyama
Yanfeng Wang
FedML
44
11
0
29 May 2024
Domain-Inspired Sharpness-Aware Minimization Under Domain Shifts
Ruipeng Zhang
Ziqing Fan
Jiangchao Yao
Ya-Qin Zhang
Yanfeng Wang
38
7
0
29 May 2024
Improving Generalization of Deep Neural Networks by Optimum Shifting
Yuyan Zhou
Ye Li
Lei Feng
Sheng-Jun Huang
OOD
ODL
38
0
0
23 May 2024
SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data
Sakshi Choudhary
Sai Aparna Aketi
Kaushik Roy
FedML
45
0
0
22 May 2024
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Xin-Chun Li
Lan Li
De-Chuan Zhan
35
2
0
21 May 2024
Exploring and Exploiting the Asymmetric Valley of Deep Neural Networks
Xin-Chun Li
Jinli Tang
Bo Zhang
Lan Li
De-Chuan Zhan
49
2
0
21 May 2024
CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
Yao Ni
Piotr Koniusz
AI4CE
GAN
40
1
0
31 Mar 2024
Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Tao Li
Qinghua Tao
Weihao Yan
Zehao Lei
Yingwen Wu
Kun Fang
M. He
Xiaolin Huang
AAML
39
5
0
30 Mar 2024
A Unified and General Framework for Continual Learning
Zhenyi Wang
Yan Li
Li Shen
Heng-Chiao Huang
CLL
32
17
0
20 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
32
0
0
01 Mar 2024
Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima
Shan Zhong
Zhongzhan Huang
Daifeng Li
Wushao Wen
Jinghui Qin
Liang Lin
22
12
0
17 Feb 2024
Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection
Chao Chen
Zhihang Fu
Kai-Chun Liu
Ze Chen
Mingyuan Tao
Jieping Ye
OODD
33
3
0
04 Feb 2024
Neglected Hessian component explains mysteries in Sharpness regularization
Yann N. Dauphin
Atish Agarwala
Hossein Mobahi
FAtt
43
7
0
19 Jan 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
I. Redko
Jianfeng Zhang
Bo An
UQCV
80
1
0
17 Jan 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan
Jiangshe Zhang
Junmin Liu
Yicheng Wang
Yunda Hao
AAML
34
1
0
14 Jan 2024
CR-SAM: Curvature Regularized Sharpness-Aware Minimization
Tao Wu
Tie Luo
D. C. Wunsch
18
3
0
21 Dec 2023
Efficient Expansion and Gradient Based Task Inference for Replay Free Incremental Learning
Soumya Roy
Vinay K. Verma
Deepak Gupta
CLL
39
2
0
02 Dec 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
44
1
0
29 Nov 2023
FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning
Zhuo Huang
Li Shen
Jun-chen Yu
Bo Han
Tongliang Liu
FedML
29
21
0
25 Oct 2023
Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization
Zhuo Huang
Muyang Li
Li Shen
Jun-chen Yu
Chen Gong
Bo Han
Tongliang Liu
OOD
46
8
0
25 Oct 2023
Why Does Sharpness-Aware Minimization Generalize Better Than SGD?
Zixiang Chen
Junkai Zhang
Yiwen Kou
Xiangning Chen
Cho-Jui Hsieh
Quanquan Gu
32
13
0
11 Oct 2023
Asymmetrically Decentralized Federated Learning
Qinglun Li
Miao Zhang
Nan Yin
Quanjun Yin
Li Shen
FedML
29
4
0
08 Oct 2023
Small batch deep reinforcement learning
J. Obando-Ceron
Marc G. Bellemare
Pablo Samuel Castro
VLM
34
14
0
05 Oct 2023
Enhancing Sharpness-Aware Optimization Through Variance Suppression
Bingcong Li
G. Giannakis
AAML
23
19
0
27 Sep 2023
Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR)
Guo-qing Jiang
Jinlong Liu
Zixiang Ding
Lin Guo
W. Lin
AI4CE
24
1
0
24 Sep 2023
On the Implicit Bias of Adam
M. D. Cattaneo
Jason M. Klusowski
Boris Shigida
31
17
0
31 Aug 2023