NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention

2 March 2024 · arXiv:2403.01273

Tianyi Zhang, Jonah Yi, Bowen Yao, Zhaozhuo Xu, Anshumali Shrivastava

Papers citing "NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention"

2 / 2 papers shown

  1. Towards a Middleware for Large Language Models
     Narcisa Guran, Florian Knauf, Man Ngo, Stefan Petrescu, Jan S. Rellermeyer
     21 Nov 2024
  2. Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
     Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, ..., Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai
     06 Oct 2024