ResearchTrend.AI: Cited By listing for arXiv:2403.17141
MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models (v2, latest)

25 March 2024
Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Tianlin Zhang, Sophia Ananiadou

Papers citing "MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models"

15 / 15 papers shown
Multi-objective Large Language Model Alignment with Hierarchical Experts
Zhuo Li, Guodong Du, Weiyang Guo, Yigeng Zhou, Xiucheng Li, ..., Fangming Liu, Yequan Wang, Deheng Ye, Min Zhang, Jing Li
ALMMoE
27 May 2025

SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety
Geon-hyeong Kim, Youngsoo Jang, Yu Jin Kim, Byoungjip Kim, Honglak Lee, Kyunghoon Bae, Moontae Lee
26 May 2025

WikiPersonas: What Can We Learn From Personalized Alignment to Famous People?
Zilu Tang, Afra Feyza Akyürek, Ekin Akyürek, Derry Wijaya
19 May 2025

PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model
Baijiong Lin, Weisen Jiang, Yuancheng Xu, Hao Chen, Ying-Cong Chen
06 May 2025

ParetoHqD: Fast Offline Multiobjective Alignment of Large Language Models using Pareto High-quality Data
Haoran Gu, Handing Wang, Yi Mei, Mengjie Zhang, Yaochu Jin
23 Apr 2025

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu
17 Apr 2025

A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models
Zhouhang Xie, Junda Wu, Yiran Shen, Yu Xia, Xintong Li, ..., Sachin Kumar, Bodhisattwa Prasad Majumder, Jingbo Shang, Prithviraj Ammanabrolu, Julian McAuley
09 Apr 2025

A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models
Carlos Peláez-González, Andrés Herrera-Poyatos, Cristina Zuheros, David Herrera-Poyatos, Virilo Tejedor, F. Herrera
AAML
07 Apr 2025

A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications
Jian Guan, Jian Wu, Jia-Nan Li, Chuanqi Cheng, Wei Wu
LM&MA
21 Mar 2025

From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
Jia-Nan Li, Jian Guan, Songhao Wu, Wei Wu, Rui Yan
19 Mar 2025

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
Ruizhe Chen, Wenhao Chai, Zhifei Yang, Xiaotian Zhang, Qiufeng Wang, Tony Q.S. Quek, Soujanya Poria, Zuozhu Liu
06 Mar 2025

Drift: Decoding-time Personalized Alignments with Implicit User Preferences
Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung
AI4TS
20 Feb 2025

Multi-Attribute Steering of Language Models via Targeted Intervention
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Joey Tianyi Zhou
LLMSV
18 Feb 2025

Selective Preference Optimization via Token-Level Reward Function Estimation
Kailai Yang, Zhiwei Liu, Qianqian Xie, Jimin Huang, Erxue Min, Sophia Ananiadou
24 Aug 2024

Quantifying Misalignment Between Agents: Towards a Sociotechnical Understanding of Alignment
Aidan Kierans, Avijit Ghosh, Hananel Hazan, Shiri Dori-Hacohen
06 Jun 2024