ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.13405
24
4

Trinity: A General Purpose FHE Accelerator

17 October 2024
Xianglong Deng
Shengyu Fan
Zhicheng Hu
Zhuoyu Tian
Zihao Yang
Jiangrui Yu
Dingyuan Cao
Dan Meng
Rui Hou
Meng Li
Qian Lou
Mingzhe Zhang
ArXivPDFHTML
Abstract

In this paper, we present the first multi-modal FHE accelerator based on a unified architecture, which efficiently supports CKKS, TFHE, and their conversion scheme within a single accelerator. To achieve this goal, we first analyze the theoretical foundations of the aforementioned schemes and highlight their composition from a finite number of arithmetic kernels. Then, we investigate the challenges for efficiently supporting these kernels within a unified architecture, which include 1) concurrent support for NTT and FFT, 2) maintaining high hardware utilization across various polynomial lengths, and 3) ensuring consistent performance across diverse arithmetic kernels. To tackle these challenges, we propose a novel FHE accelerator named Trinity, which incorporates algorithm optimizations, hardware component reuse, and dynamic workload scheduling to enhance the acceleration of CKKS, TFHE, and their conversion scheme. By adaptive select the proper allocation of components for NTT and MAC, Trinity maintains high utilization across NTTs with various polynomial lengths and imbalanced arithmetic workloads. The experiment results show that, for the pure CKKS and TFHE workloads, the performance of our Trinity outperforms the state-of-the-art accelerator for CKKS (SHARP) and TFHE (Morphling) by 1.49x and 4.23x, respectively. Moreover, Trinity achieves 919.3x performance improvement for the FHE-conversion scheme over the CPU-based implementation. Notably, despite the performance improvement, the hardware overhead of Trinity is only 85% of the summed circuit areas of SHARP and Morphling.

View on arXiv
Comments on this paper