No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

6 February 2022 (arXiv:2202.02664)
Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, T. Zhao

Papers citing "No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models"

4 papers shown

Learning List-Level Domain-Invariant Representations for Ranking
Ruicheng Xian, Honglei Zhuang, Zhen Qin, Hamed Zamani, Jing Lu, Ji Ma, Kai Hui, Han Zhao, Xuanhui Wang, Michael Bendersky
21 Dec 2022 (OOD)

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj, Chenyan Xiong, Guolin Ke, Xiaodong Liu, Di He, Saurabh Tiwary, Tie-Yan Liu, Paul N. Bennett, Xia Song, Jianfeng Gao
13 Apr 2022

The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
23 Jul 2020

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018 (ELM)