A Hessian-informed hyperparameter optimization for differential learning rate

12 January 2025
Shiyun Xu, Zhiqi Bu, Yiliang Zhang, Ian Barnett
Abstract

Differential learning rate (DLR), a technique that applies different learning rates to different model parameters, has been widely used in deep learning and has achieved empirical success in its various forms. For example, parameter-efficient fine-tuning (PEFT) applies zero learning rates to most parameters so as to significantly reduce the computational cost. At its core, DLR leverages the observation that different parameters can have different loss curvatures, which are hard to characterize in general. We propose the Hessian-informed differential learning rate (Hi-DLR), an efficient approach that solves the hyperparameter optimization (HPO) of learning rates and adaptively captures the loss curvature for any model and optimizer. Given a proper grouping of parameters, we empirically demonstrate that Hi-DLR can improve convergence by dynamically determining the learning rates during training.
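The abstract does not spell out the Hi-DLR algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each parameter group's curvature can be estimated with a Hutchinson-style Hessian-vector-product estimator (restricted to that group's diagonal block of the Hessian), and it scales a base learning rate inversely with the estimate. All names (group_curvatures, base_lr, the toy model) are illustrative.

import torch

def group_curvatures(loss, param_groups, n_samples=4):
    # Average Hessian-diagonal magnitude per parameter group, estimated with
    # Hutchinson's trick: E[v^T H v] = tr(H) for Rademacher-distributed v.
    params = [p for g in param_groups for p in g]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    curvatures, offset = [], 0
    for group in param_groups:
        gs = grads[offset:offset + len(group)]
        offset += len(group)
        est = 0.0
        numel = sum(p.numel() for p in group)
        for _ in range(n_samples):
            vs = [torch.randint_like(p, 2) * 2.0 - 1.0 for p in group]  # +/-1 entries
            # Hessian-vector product against this group's diagonal block.
            hvs = torch.autograd.grad(gs, group, grad_outputs=vs, retain_graph=True)
            est += sum((v * hv).sum() for v, hv in zip(vs, hvs)).item()
        curvatures.append(abs(est) / (n_samples * numel) + 1e-12)
    return curvatures

# Usage sketch: two groups (backbone vs. head), lr scaled inversely to curvature.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
)
groups = [list(model[0].parameters()), list(model[2].parameters())]
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
base_lr = 1e-2
optimizer = torch.optim.SGD([
    # Clamp the estimate so a near-zero curvature cannot blow up the step size.
    {"params": g, "lr": base_lr / max(c, 1e-3)}
    for g, c in zip(groups, group_curvatures(loss, groups))
])

Re-running the curvature estimate periodically during training would be one way to realize the abstract's claim that Hi-DLR determines the learning rates dynamically rather than fixing them once.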

View on arXiv: https://arxiv.org/abs/2501.06954
@article{xu2025_2501.06954,
  title={A Hessian-informed hyperparameter optimization for differential learning rate},
  author={Shiyun Xu and Zhiqi Bu and Yiliang Zhang and Ian Barnett},
  journal={arXiv preprint arXiv:2501.06954},
  year={2025}
}