ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.17331
  4. Cited By
ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training

22 May 2025
Maryam Dialameh
Rezaul Karim
Hossein Rajabzadeh
Omar Mohamed Awad
Hyock Ju Kwon
Boxing Chen
Walid Ahmed
Yang Liu
ArXivPDFHTML

Papers citing "ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training"

8 / 8 papers shown
Title
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi
Walid Ahmed
Habib Hajimolahoseini
Foozhan Ataiefard
Mohammad Hassanpour
Saina Asani
Austin Wen
Omar Mohamed Awad
Kangling Liu
Yang Liu
VLM
72
8
0
06 Nov 2023
Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization
Training Acceleration of Low-Rank Decomposed Networks using Sequential Freezing and Rank Quantization
Habib Hajimolahoseini
Walid Ahmed
Yang Liu
OffRL
MQ
44
7
0
07 Sep 2023
Program Synthesis with Large Language Models
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
90
1,893
0
16 Aug 2021
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
155
5,328
0
07 Jul 2021
Aligning AI With Shared Human Values
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jingkai Li
D. Song
Jacob Steinhardt
112
540
0
05 Aug 2020
Longformer: The Long-Document Transformer
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
95
3,996
0
10 Apr 2020
Sigmoid-Weighted Linear Units for Neural Network Function Approximation
  in Reinforcement Learning
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning
Stefan Elfwing
E. Uchibe
Kenji Doya
57
1,702
0
10 Feb 2017
Overcoming catastrophic forgetting in neural networks
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
265
7,410
0
02 Dec 2016
1