ResearchTrend.AI
arXiv: 2406.06606
Prototypical Reward Network for Data-Efficient RLHF

6 June 2024
Jinghan Zhang
Xiting Wang
Yiqiao Jin
Changyu Chen
Xinhao Zhang
Kunpeng Liu
    ALM

Papers citing "Prototypical Reward Network for Data-Efficient RLHF"

11 / 11 papers shown
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Bing Wang
Rui Zheng
Luyao Chen
Yan Liu
Shihan Dou
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yuanyuan Jiang
ALM
86
106
0
11 Jan 2024
RRHF: Rank Responses to Align Language Models with Human Feedback without tears
Zheng Yuan
Hongyi Yuan
Chuanqi Tan
Wei Wang
Songfang Huang
Feiran Huang
ALM
140
369
0
11 Apr 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
168
1,603
0
15 Dec 2022
Few-shot Named Entity Recognition with Entity-level Prototypical Network Enhanced by Dispersedly Distributed Prototypes
Shezheng Song
Shasha Li
Shaoduo Gan
Jie Yu
Jun Ma
Bin Ji
43
32
0
17 Aug 2022
The Online Pivot: Lessons Learned from Teaching a Text and Data Mining Course in Lockdown, Enhancing online Teaching with Pair Programming and Digital Badges
Beatrice Alex
Claire Llewellyn
P. Orzechowski
Maria Boutchkova
26
2
0
03 May 2021
Graph Prototypical Networks for Few-shot Learning on Attributed Networks
Kaize Ding
Jianling Wang
Jundong Li
Kai Shu
Chenghao Liu
Huan Liu
48
162
0
23 Jun 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
449
1,717
0
18 Sep 2019
Interpretable and Steerable Sequence Learning via Prototypes
Yao Ming
Panpan Xu
Huamin Qu
Liu Ren
AI4TS
47
141
0
23 Jul 2019
Unified Language Model Pre-training for Natural Language Understanding and Generation
Li Dong
Nan Yang
Wenhui Wang
Furu Wei
Xiaodong Liu
Yu Wang
Jianfeng Gao
M. Zhou
H. Hon
ELM
AI4CE
182
1,554
0
08 May 2019
A Deep Reinforced Model for Abstractive Summarization
Romain Paulus
Caiming Xiong
R. Socher
AI4TS
173
1,556
0
11 May 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
149
1,530
0
25 Jan 2017