2307.12533
Cited By
PUMA: Secure Inference of LLaMA-7B in Five Minutes
24 July 2023
Ye Dong
Wen-jie Lu
Yancheng Zheng
Haoqi Wu
Derun Zhao
Jin Tan
Zhicong Huang
Cheng Hong
Tao Wei
Wenguang Chen
Papers citing
"PUMA: Secure Inference of LLaMA-7B in Five Minutes"
29 / 29 papers shown
Title
Private Transformer Inference in MLaaS: A Survey
Yang Li
Xinyu Zhou
Yishuo Wang
Liangxin Qian
Jun Zhao
21
0
0
15 May 2025
Comet: Accelerating Private Inference for Large Language Model by Predicting Activation Sparsity
Guang Yan
Yuhui Zhang
Zimu Guo
Lutan Zhao
Xiaojun Chen
Chen Wang
Wenhao Wang
Dan Meng
Rui Hou
33
0
0
12 May 2025
Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy
Haoqi Wu
Wei Dai
Li Wang
Qiang Yan
SILM
33
0
0
09 May 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
57
2
0
24 Feb 2025
CipherPrune: Efficient and Scalable Private Transformer Inference
Yancheng Zhang
Jiaqi Xue
Mengxin Zheng
Mimi Xie
Mingzhe Zhang
Lei Jiang
Qian Lou
59
2
0
24 Feb 2025
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private Large Language Model Inference
Wenxuan Zeng
Ye Dong
Jinjin Zhou
Junming Ma
Jin Tan
Runsheng Wang
Meng Li
49
0
0
12 Jan 2025
TruncFormer: Private LLM Inference Using Only Truncations
Patrick Yubeaton
Jianqiao Mo
Karthik Garimella
N. Jha
Brandon Reagen
Chinmay Hegde
Siddharth Garg
74
0
0
02 Dec 2024
Nimbus: Secure and Efficient Two-Party Inference for Transformers
Zhengyi Li
Kang Yang
Jin Tan
Wen-jie Lu
Haoqi Wu
...
Yu Yu
Derun Zhao
Yancheng Zheng
M. Guo
Jingwen Leng
72
2
0
24 Nov 2024
Probe-Me-Not: Protecting Pre-trained Encoders from Malicious Probing
Ruyi Ding
Tong Zhou
Lili Su
A. A. Ding
Xiaolin Xu
Yunsi Fei
AAML
66
1
0
19 Nov 2024
CipherDM: Secure Three-Party Inference for Diffusion Model Sampling
Xin Zhao
Xiaojun Chen
Xinyu Chen
He Li
Tingyu Fan
Zhendong Zhao
34
1
0
09 Sep 2024
MPC-Minimized Secure LLM Inference
Deevashwer Rathee
Dacheng Li
Ion Stoica
Hao Zhang
Raluca A. Popa
39
1
0
07 Aug 2024
TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing
Husheng Han
Xinyao Zheng
Yuanbo Wen
Yifan Hao
Erhu Feng
...
Pengwei Jin
Xinkai Song
Zidong Du
Qi Guo
Xing Hu
30
0
0
12 Jul 2024
PDSS: A Privacy-Preserving Framework for Step-by-Step Distillation of Large Language Models
Tao Fan
Yan Kang
Weijing Chen
Hanlin Gu
Yuanfeng Song
Lixin Fan
Kai Chen
Qiang Yang
27
0
0
18 Jun 2024
Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey
Shang Wang
Tianqing Zhu
Bo Liu
Ming Ding
Xu Guo
Dayong Ye
Wanlei Zhou
Philip S. Yu
PILM
67
17
0
12 Jun 2024
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas
Chengyuan Deng
Yiqun Duan
Xin Jin
Heng Chang
Yijun Tian
...
Kuofeng Gao
Sihong He
Jun Zhuang
Lu Cheng
Haohan Wang
AILaw
40
16
0
08 Jun 2024
PermLLM: Private Inference of Large Language Models within 3 Seconds under WAN
Fei Zheng
Chaochao Chen
Zhongxuan Han
Xiaolin Zheng
LRM
37
4
0
29 May 2024
Ditto: Quantization-aware Secure Inference of Transformers upon MPC
Haoqi Wu
Wenjing Fang
Yancheng Zheng
Junming Ma
Jin Tan
Yinggui Wang
Lei Wang
MQ
45
2
0
09 May 2024
A Framework for Cost-Effective and Self-Adaptive LLM Shaking and Recovery Mechanism
Zhiyuan Chen
Yu Li
Suochao Zhang
Jingbo Zhou
Jiwen Zhou
Chenfu Bao
Dianhai Yu
26
0
0
12 Mar 2024
On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
Biwei Yan
Kun Li
Minghui Xu
Yueyan Dong
Yue Zhang
Zhaochun Ren
Xiuzhen Cheng
AILaw
PILM
70
76
0
08 Mar 2024
Spin: An Efficient Secure Computation Framework with GPU Acceleration
Wuxuan Jiang
Xiangjun Song
Shenbai Hong
Haijun Zhang
Wenxin Liu
Bo Zhao
Wei Xu
Yi Li
20
1
0
04 Feb 2024
SecFormer: Towards Fast and Accurate Privacy-Preserving Inference for Large Language Models
Jinglong Luo
Yehong Zhang
Zhuo Zhang
Jiaqi Zhang
Xin Mu
Hui Wang
Yue Yu
Zenglin Xu
40
9
0
01 Jan 2024
Grounding Foundation Models through Federated Transfer Learning: A General Framework
Yan Kang
Tao Fan
Hanlin Gu
Xiaojin Zhang
Lixin Fan
Qiang Yang
AI4CE
68
19
0
29 Nov 2023
Input Reconstruction Attack against Vertical Federated Large Language Models
Fei Zheng
FedML
19
6
0
07 Nov 2023
Λ-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI
Shoki Ohta
Takayuki Nishio
62
4
0
23 Oct 2023
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
45
42
0
16 Oct 2023
East: Efficient and Accurate Secure Transformer Framework for Inference
Yuanchao Ding
Hua Guo
Yewei Guan
Weixin Liu
Jiarong Huo
Zhenyu Guan
Xiyong Zhang
14
17
0
19 Aug 2023
CryptGPU: Fast Privacy-Preserving Machine Learning on the GPU
Sijun Tan
Brian Knott
Yuan Tian
David J. Wu
BDL
FedML
57
183
0
22 Apr 2021
CrypTFlow2: Practical 2-Party Secure Inference
Deevashwer Rathee
Mayank Rathee
Nishant Kumar
Nishanth Chandran
Divya Gupta
Aseem Rastogi
Rahul Sharma
87
301
0
13 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018