2502.01458
The Capabilities and Limitations of Weak-to-Strong Generalization: Generalization and Calibration
3 February 2025
Wei Yao
Wenkai Yang
Ziyi Wang
Yankai Lin
Yong Liu
ELM
Papers citing
"The Capabilities and Limitations of Weak-to-Strong Generalization: Generalization and Calibration"
17 / 17 papers shown
Title
Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Wenhong Zhu
Zhiwei He
Xiaofeng Wang
Pengfei Liu
Rui Wang
OSLM
77
4
0
24 Oct 2024
Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning
Ruimeng Ye
Yang Xiao
Bo Hui
ALM
ELM
OffRL
92
3
0
16 Oct 2024
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
Wenkai Yang
Shiqi Shen
Guangyao Shen
Zhi Gong
Yankai Lin
Ji-Rong Wen
82
15
0
17 Jun 2024
Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning
Jitao Sang
Yuhang Wang
Jing Zhang
Yanxu Zhu
Chao Kong
Junhong Ye
Shuyu Wei
Jinlin Xiao
78
10
0
01 Feb 2024
Generalization Bounds: Perspectives from Information Theory and PAC-Bayes
Fredrik Hellström
G. Durisi
Benjamin Guedj
Maxim Raginsky
37
36
0
08 Sep 2023
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
1.4K
14,313
0
15 Mar 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
184
1,614
0
15 Dec 2022
A Close Look into the Calibration of Pre-trained Language Models
Yangyi Chen
Lifan Yuan
Ganqu Cui
Zhiyuan Liu
Heng Ji
121
51
0
31 Oct 2022
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
Ananya Kumar
Aditi Raghunathan
Robbie Jones
Tengyu Ma
Percy Liang
OODD
114
671
0
21 Feb 2022
Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis
Qi Chen
Changjian Shui
M. Marchand
81
44
0
29 Sep 2021
Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
Sebastian Lee
Sebastian Goldt
Andrew M. Saxe
CLL
67
74
0
09 Jul 2021
Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer
Xiaohua Zhai
Amelie Royer
L. Markeeva
Rohan Anil
Alexander Kolesnikov
VLM
107
295
0
09 Jun 2021
Mitigating Bias in Calibration Error Estimation
Rebecca Roelofs
Nicholas Cain
Jonathon Shlens
Michael C. Mozer
69
95
0
15 Dec 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
286
300
0
17 Mar 2020
Verified Uncertainty Calibration
Ananya Kumar
Percy Liang
Tengyu Ma
164
354
0
23 Sep 2019
Snorkel: Rapid Training Data Creation with Weak Supervision
Alexander Ratner
Stephen H. Bach
Henry R. Ehrenberg
Jason Alan Fries
Sen Wu
Christopher Ré
73
1,027
0
28 Nov 2017
Information-theoretic analysis of generalization capability of learning algorithms
Aolin Xu
Maxim Raginsky
166
446
0
22 May 2017