Multi-objective Large Language Model Alignment with Hierarchical Experts

Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce \textit{HoE} (Hierarchical Mixture-of-Experts), a \textit{lightweight}, \textit{parameter-efficient}, and \textit{plug-and-play} approach that eliminates the need for model training while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, \textit{HoE} consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing, reaching the optimal Pareto frontier while striking a favorable trade-off among parameter size, training cost, and performance. We evaluate \textit{HoE} on 14 objectives and 200 different preferences across 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.
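To make the three-level structure concrete, below is a minimal NumPy sketch of one plausible reading of the hierarchy: a user preference vector is routed to the nearest Router Expert, which supplies mixing weights over per-objective LoRA Experts whose low-rank deltas are merged into a base weight. All names, the nearest-anchor routing rule, and the random toy parameters are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

D, R = 16, 4                         # hidden size, LoRA rank (toy values)
N_OBJ, N_LORA, N_ROUTER = 3, 3, 5    # objectives, LoRA experts, router experts

# One LoRA expert per objective: a low-rank delta B @ A added to the base weight.
lora_A = [rng.normal(size=(R, D)) * 0.01 for _ in range(N_LORA)]
lora_B = [rng.normal(size=(D, R)) * 0.01 for _ in range(N_LORA)]
base_W = rng.normal(size=(D, D)) * 0.1

# Hypothetical: each Router Expert is anchored at a preference point and holds
# mixing weights over the LoRA experts (random here; HoE would derive these).
anchor_prefs = rng.dirichlet(np.ones(N_OBJ), size=N_ROUTER)
router_mix = rng.dirichlet(np.ones(N_LORA), size=N_ROUTER)

def preference_routing(user_pref: np.ndarray) -> np.ndarray:
    """Pick the Router Expert whose anchor preference is closest to the user's."""
    dists = np.linalg.norm(anchor_prefs - user_pref, axis=1)
    return router_mix[np.argmin(dists)]

def merged_weight(user_pref: np.ndarray) -> np.ndarray:
    """Compose the base weight with a preference-weighted sum of LoRA deltas."""
    mix = preference_routing(user_pref)
    delta = sum(m * (B @ A) for m, A, B in zip(mix, lora_A, lora_B))
    return base_W + delta

user_pref = np.array([0.6, 0.3, 0.1])  # e.g. weights over three objectives
W = merged_weight(user_pref)
print(W.shape)  # (16, 16): a per-preference adapted weight, with no retraining
```

The point of the sketch is the plug-and-play property: swapping the user preference vector only changes the routing and the LoRA mixture, never the frozen base weights.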
@article{li2025_2505.20925,
  title={Multi-objective Large Language Model Alignment with Hierarchical Experts},
  author={Zhuo Li and Guodong Du and Weiyang Guo and Yigeng Zhou and Xiucheng Li and Wenya Wang and Fangming Liu and Yequan Wang and Deheng Ye and Min Zhang and Jing Li},
  journal={arXiv preprint arXiv:2505.20925},
  year={2025}
}