ResearchTrend.AI
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
v2 (latest)


4 March 2025
Songming Zhang
Xue Zhang
Tong Zhang
Bojie Hu
Yufeng Chen
Jinan Xu
ArXiv (abs) · PDF · HTML

Papers citing "AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation"

4 / 4 papers shown

1. A Dual-Space Framework for General Knowledge Distillation of Large Language Models
   Wei Wei, Songming Zhang, Yunlong Liang, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
   15 Apr 2025

2. Latent Feature Mining for Predictive Model Enhancement with Large Language Models
   Bingxuan Li, Pengyi Shi, Amy Ward
   06 Oct 2024

3. DPO Meets PPO: Reinforced Token Optimization for RLHF
   Han Zhong, Zikang Shan, Guhao Feng, Wei Xiong, Xinle Cheng, Li Zhao, Di He, Jiang Bian, Liwei Wang
   29 Apr 2024

4. Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators (ALM)
   Yann Dubois, Balázs Galambosi, Percy Liang, Tatsunori Hashimoto
   06 Apr 2024