Aligning Language Models with Human Preferences via a Bayesian Approach

9 October 2023 · arXiv:2310.05782
Jiashuo Wang, Haozhao Wang, Shichao Sun, Wenjie Li · ALM

Papers citing "Aligning Language Models with Human Preferences via a Bayesian Approach"

All 21 citing papers are shown, most recent first.

Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, C. Shi
03 Apr 2025 · Citations: 1

Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
M. Wong, C. Tan · ALM
19 Mar 2025 · Citations: 4

Modeling Subjectivity in Cognitive Appraisal with Language Models
Yuxiang Zhou, Hainiu Xu, Desmond C. Ong, Petr Slovak, Yulan He
14 Mar 2025 · Citations: 0

AI Alignment at Your Discretion
Maarten Buyl, Hadi Khalaf, C. M. Verdun, Lucas Monteiro Paes, Caio Vieira Machado, Flavio du Pin Calmon
10 Feb 2025 · Citations: 0

Geometric-Averaged Preference Optimization for Soft Preference Labels
Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu, Y. Matsuo, Aleksandra Faust, Heiga Zen, Izzeddin Gur
31 Dec 2024 · Citations: 7

Optimizing Preference Alignment with Differentiable NDCG Ranking
Jiacong Zhou, Xianyun Wang, Jun Yu
17 Oct 2024 · Citations: 2

E2CL: Exploration-based Error Correction Learning for Embodied Agents
Hanlin Wang, Chak Tou Leong, Jian Wang, Wenjie Li
05 Sep 2024 · Citations: 1

StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement
Yahui Fu, Chenhui Chu, Tatsuya Kawahara
05 Aug 2024 · Citations: 2

A SMART Mnemonic Sounds like "Glue Tonic": Mixing LLMs with Student Feedback to Make Mnemonic Learning Stick
Nishant Balepur, Matthew Shu, Alexander Hoyle, Alison Robey, Shi Feng, Seraphina Goldfarb-Tarrant, Jordan Boyd-Graber
21 Jun 2024 · Citations: 2

A Survey on Human Preference Learning for Large Language Models
Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang
17 Jun 2024 · Citations: 8

Aligning to Thousands of Preferences via System Message Generalization
Seongyun Lee, Sue Hyun Park, Seungone Kim, Minjoon Seo · ALM
28 May 2024 · Citations: 38

Embedding-Aligned Language Models
Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Lior Shani, Ethan Liang, Craig Boutilier · AIFin
24 May 2024 · Citations: 1

Towards Human-centered Proactive Conversational Agents
Yang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Yang, Tat-Seng Chua · LLMAG
19 Apr 2024 · Citations: 25

ROPO: Robust Preference Optimization for Large Language Models
Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue-bo Wu, Zhihang Fu, Zhihao Shi, Feng Wu, Jieping Ye
05 Apr 2024 · Citations: 1

Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback
Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
11 Jan 2024 · Citations: 1

On Diversified Preferences of Large Language Model Alignment
Dun Zeng, Yong Dai, Pengyu Cheng, Longyue Wang, Tianhao Hu, Wanshun Chen, Nan Du, Zenglin Xu · ALM
12 Dec 2023 · Citations: 16

Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, ..., John F. J. Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, G. Irving · ALM, AAML
28 Sep 2022 · Citations: 506

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe · OSLM, ALM
04 Mar 2022 · Citations: 12,003

Trustworthy AI: From Principles to Practices
Bo-wen Li, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi, Bowen Zhou
04 Oct 2021 · Citations: 356

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators' Disagreement
Elisa Leonardelli, Stefano Menini, Alessio Palmero Aprosio, Marco Guerini, Sara Tonelli
28 Sep 2021 · Citations: 97

Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler, Nisan Stiennon, Jeff Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, G. Irving · ALM
18 Sep 2019 · Citations: 1,610