2312.09390
Cited By
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
14 December 2023
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
Leopold Aschenbrenner
Yining Chen
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
Papers citing
"Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision"
18 / 68 papers shown
Title
JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability
Junda Wang
Zhichao Yang
Zonghai Yao
Hong-ye Yu
BDL
AI4MH
LRM
40
30
0
27 Feb 2024
Towards Unified Alignment Between Agents, Humans, and Environment
Zonghan Yang
An Liu
Zijun Liu
Kai Liu
Fangzhou Xiong
...
Zhenhe Zhang
Fuwen Luo
Zhicheng Guo
Peng Li
Yang Liu
32
4
0
12 Feb 2024
Navigating the OverKill in Large Language Models
Chenyu Shi
Xiao Wang
Qiming Ge
Songyang Gao
Xianjun Yang
Tao Gui
Qi Zhang
Xuanjing Huang
Xun Zhao
Dahua Lin
21
11
0
31 Jan 2024
Scheming AIs: Will AIs fake alignment during training in order to get power?
Joe Carlsmith
67
30
0
14 Nov 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
40
3
0
08 Aug 2023
Evolutionary Generalized Zero-Shot Learning
Dubing Chen
Haofeng Zhang
Yang Long
VLM
34
1
0
23 Nov 2022
Exploring The Landscape of Distributional Robustness for Question Answering Models
Anas Awadalla
Mitchell Wortsman
Gabriel Ilharco
Sewon Min
Ian H. Magnusson
Hannaneh Hajishirzi
Ludwig Schmidt
ELM
OOD
KELM
72
19
0
22 Oct 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
502
0
28 Sep 2022
An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation
Ziquan Liu
Yi Tian Xu
Yuanhong Xu
Qi Qian
Hao Li
Rong Jin
Xiangyang Ji
Antoni B. Chan
OOD
45
14
0
25 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Diversify and Disambiguate: Learning From Underspecified Data
Yoonho Lee
Huaxiu Yao
Chelsea Finn
213
64
0
07 Feb 2022
Editing a classifier by rewriting its prediction rules
Shibani Santurkar
Dimitris Tsipras
Mahalaxmi Elango
David Bau
Antonio Torralba
A. Madry
KELM
180
89
0
02 Dec 2021
Truthful AI: Developing and governing AI that does not lie
Owain Evans
Owen Cotton-Barratt
Lukas Finnveden
Adam Bales
Avital Balwit
Peter Wills
Luca Righetti
William Saunders
HILM
236
109
0
13 Oct 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
317
5,785
0
29 Apr 2021
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
216
423
0
17 Feb 2021
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
204
200
0
02 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,198
0
01 Sep 2014