2406.06606
Prototypical Reward Network for Data-Efficient RLHF
6 June 2024
Jinghan Zhang
Xiting Wang
Yiqiao Jin
Changyu Chen
Xinhao Zhang
Kunpeng Liu
ALM
Papers citing
"Prototypical Reward Network for Data-Efficient RLHF"
11 / 11 papers shown
Title
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Bing Wang
Rui Zheng
Luyao Chen
Yan Liu
Shihan Dou
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yuanyuan Jiang
ALM
86
106
0
11 Jan 2024
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
Zheng Yuan
Hongyi Yuan
Chuanqi Tan
Wei Wang
Songfang Huang
Feiran Huang
ALM
140
369
0
11 Apr 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
168
1,603
0
15 Dec 2022
Few-shot Named Entity Recognition with Entity-level Prototypical Network Enhanced by Dispersedly Distributed Prototypes
Shezheng Song
Shasha Li
Shaoduo Gan
Jie Yu
Jun Ma
Bin Ji
43
32
0
17 Aug 2022
The Online Pivot: Lessons Learned from Teaching a Text and Data Mining Course in Lockdown, Enhancing online Teaching with Pair Programming and Digital Badges
Beatrice Alex
Claire Llewellyn
P. Orzechowski
Maria Boutchkova
26
2
0
03 May 2021
Graph Prototypical Networks for Few-shot Learning on Attributed Networks
Kaize Ding
Jianling Wang
Jundong Li
Kai Shu
Chenghao Liu
Huan Liu
48
162
0
23 Jun 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
449
1,717
0
18 Sep 2019
Interpretable and Steerable Sequence Learning via Prototypes
Yao Ming
Panpan Xu
Huamin Qu
Liu Ren
AI4TS
47
141
0
23 Jul 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
182
1,554
0
08 May 2019
A Deep Reinforced Model for Abstractive Summarization
Romain Paulus
Caiming Xiong
R. Socher
AI4TS
173
1,556
0
11 May 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
149
1,530
0
25 Jan 2017