Singular Value Decomposition on Kronecker Adaptation for Large Language Model

18 June 2025

Yee Hin Chong

Peng Qu

Main:3 Pages

2 Figures

Bibliography:2 Pages

1 Table

Abstract

Large pre-trained Transformer models achieve state-of-the-art results across diverse language and reasoning tasks, but full fine-tuning incurs substantial storage, memory, and computational overhead. Parameter-efficient fine-tuning (PEFT) methods mitigate these costs by learning only a small subset of task-specific parameters, yet existing approaches either introduce inference-time latency (adapter modules), suffer from suboptimal convergence (randomly initialized low-rank updates), or rely on fixed rank choices that may not match task complexity (Kronecker-based decompositions). We propose SoKA (SVD on Kronecker Adaptation), a novel PEFT strategy that combines Kronecker-product tensor factorization with SVD-driven initialization and spectrum-aware dynamic rank selection. Our Kronecker-Product SVD (KPSVD) procedure extracts principal components of the full weight update into compact Kronecker factors, while an adaptive rank selection algorithm uses energy-threshold and elbow-point criteria to prune negligible components. Empirical evaluation on LLaMA2-7B across arithmetic reasoning (GSM8K), formal mathematics (MATH), and code generation (MBPP) demonstrates that SoKA requires only 0.99M trainable parameters, 25% fewer than LoRA/PiSSA, while matching or exceeding baseline performance. Moreover, SoKA exhibits faster convergence and more stable gradients, highlighting its robustness and efficiency for large-scale model adaptation.
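
To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of a Kronecker-product SVD followed by spectrum-based rank selection. It is not the authors' implementation: the abstract does not specify the factor shapes, the rearrangement, or the exact selection rules, so the block-rearrangement approach used here, the function names `kpsvd`, `rank_by_energy`, and `rank_by_elbow`, the threshold `tau`, and the "largest gap" elbow heuristic are all assumptions chosen for illustration.

```python
import numpy as np


def kpsvd(W, m1, n1, m2, n2):
    """Kronecker-product SVD of W with shape (m1*m2, n1*n2).

    Returns (s, A, B) such that W == sum_k s[k] * kron(A[k], B[k]),
    where A[k] has shape (m1, n1) and B[k] has shape (m2, n2).
    """
    assert W.shape == (m1 * m2, n1 * n2)
    # Rearrange W so that row (i, j) holds the flattened (m2 x n2) block (i, j).
    # A Kronecker product kron(A, B) becomes a rank-1 matrix under this map.
    R = W.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = U.T.reshape(-1, m1, n1)   # A[k] is the k-th left singular vector, reshaped
    B = Vt.reshape(-1, m2, n2)    # B[k] is the k-th right singular vector, reshaped
    return s, A, B


def rank_by_energy(s, tau=0.95):
    """Smallest r whose leading singular values hold a fraction tau of the energy.

    Assumption: 'energy' means the sum of squared singular values.
    """
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    return min(int(np.searchsorted(energy, tau)) + 1, len(s))


def rank_by_elbow(s):
    """Hypothetical elbow rule: cut after the largest drop between neighbours."""
    if len(s) < 2:
        return len(s)
    return int(np.argmax(s[:-1] - s[1:])) + 1


# Illustrative usage on a synthetic matrix dominated by two Kronecker terms.
rng = np.random.default_rng(0)
m1, n1, m2, n2 = 4, 3, 5, 2
W = (np.kron(rng.standard_normal((m1, n1)), rng.standard_normal((m2, n2)))
     + 0.1 * np.kron(rng.standard_normal((m1, n1)), rng.standard_normal((m2, n2))))

s, A, B = kpsvd(W, m1, n1, m2, n2)
r = rank_by_energy(s, tau=0.999)
W_r = sum(s[k] * np.kron(A[k], B[k]) for k in range(r))
print("kept rank:", r)
print("relative error:", np.linalg.norm(W - W_r) / np.linalg.norm(W))
```

Each retained term stores m1*n1 + m2*n2 numbers instead of the m1*m2*n1*n2 entries of a dense update, which is the kind of saving a Kronecker factorization is meant to provide; how SoKA combines the energy and elbow criteria in practice is described in the paper itself rather than in this sketch.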

@article{chong2025_2506.15251,
  title={Singular Value Decomposition on Kronecker Adaptation for Large Language Model},
  author={Yee Hin Chong and Peng Qu},
  journal={arXiv preprint arXiv:2506.15251},
  year={2025}
}
