
An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

28 June 2023
Haihao Shen
Hengyu Meng
Bo Dong
Zhe Wang
Ofir Zafrir
Yi Ding
Yunqian Luo
Hanwen Chang
Qun Gao
Zi. Wang
Guy Boudoukh
Moshe Wasserblat
Topic: MoE
arXiv: 2306.16601

Papers citing "An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs"

2 papers shown.
  1. Model Compression and Efficient Inference for Large Language Models: A Survey
     Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He
     Topic: MQ
     15 Feb 2024
  2. I-BERT: Integer-only BERT Quantization
     Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer
     Topic: MQ
     05 Jan 2021