From Lists to Emojis: How Format Bias Affects Model Alignment


Papers citing "From Lists to Emojis: How Format Bias Affects Model Alignment"

50 / 61 papers shown
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu, Bowen Yu, Fei Huang, Yang Fan, Runji Lin, Chang Zhou
28 May 2024
