ResearchTrend.AI
Investigating Alternatives to the Root Mean Square for Adaptive Gradient Methods

10 June 2021
Brett Daley
Chris Amato
    ODL

Papers citing "Investigating Alternatives to the Root Mean Square for Adaptive Gradient Methods"

1 / 1 papers shown
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,833
0
17 Sep 2019