2211.15006
Cited By
Fine-tuning language models to find agreement among humans with diverse preferences
28 November 2022
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
Jan Balaguer
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
Papers citing "Fine-tuning language models to find agreement among humans with diverse preferences"
50 / 123 papers shown
Title
AgentsCoDriver: Large Language Model Empowered Collaborative Driving with Lifelong Learning
Senkang Hu
Zhengru Fang
Zihan Fang
Yiqin Deng
Xianhao Chen
Yuguang Fang
60
33
0
09 Apr 2024
Aligning Diffusion Models by Optimizing Human Utility
Shufan Li
Konstantinos Kallidromitis
Akash Gokul
Yusuke Kato
Kazuki Kozuka
107
29
0
06 Apr 2024
The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition
Georgios Chochlakis
Alexandros Potamianos
Kristina Lerman
Shrikanth Narayanan
40
5
0
25 Mar 2024
ChatGPT Incorrectness Detection in Software Reviews
M. Tanzil
Junaed Younus Khan
Gias Uddin
19
4
0
25 Mar 2024
MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models
Subash Neupane
Shaswata Mitra
Sudip Mittal
Noorbakhsh Amiri Golilarz
Shahram Rahimi
Amin Amirlatifi
64
3
0
13 Mar 2024
Provable Multi-Party Reinforcement Learning with Diverse Human Feedback
Huiying Zhong
Zhun Deng
Weijie J. Su
Zhiwei Steven Wu
Linjun Zhang
52
13
0
08 Mar 2024
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
Akila Wickramasekara
F. Breitinger
Mark Scanlon
52
8
0
29 Feb 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
40
71
0
28 Feb 2024
SoFA: Shielded On-the-fly Alignment via Priority Rule Following
Xinyu Lu
Bowen Yu
Yaojie Lu
Hongyu Lin
Haiyang Yu
Le Sun
Xianpei Han
Yongbin Li
63
13
0
27 Feb 2024
Unintended Impacts of LLM Alignment on Global Representation
Michael Joseph Ryan
William B. Held
Diyi Yang
45
40
0
22 Feb 2024
Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia
Tzu-Sheng Kuo
Aaron L Halfaker
Zirui Cheng
Jiwoo Kim
Meng-Hsin Wu
Tongshuang Wu
Kenneth Holstein
Haiyi Zhu
62
21
0
21 Feb 2024
Large Language Models for Data Annotation: A Survey
Zhen Tan
Dawei Li
Song Wang
Alimohammad Beigi
Bohan Jiang
Amrita Bhattacharjee
Mansooreh Karami
Wenlin Yao
Lu Cheng
Huan Liu
SyDa
56
50
0
21 Feb 2024
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang
Zhangchen Xu
Luyao Niu
Zhen Xiang
Bhaskar Ramasubramanian
Bo Li
Radha Poovendran
47
86
0
19 Feb 2024
Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
Ming Li
Jiuhai Chen
Lichang Chen
Dinesh Manocha
71
17
0
16 Feb 2024
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty
Jiahao Qiu
Hui Yuan
Alec Koppel
Furong Huang
Dinesh Manocha
Amrit Singh Bedi
Mengdi Wang
ALM
35
47
0
14 Feb 2024
ChatGPT vs LLaMA: Impact, Reliability, and Challenges in Stack Overflow Discussions
Leuson Da Silva
Jordan Samhi
Foutse Khomh
ALM
SILM
ELM
AI4MH
31
10
0
13 Feb 2024
Mercury: A Code Efficiency Benchmark for Code Large Language Models
Mingzhe Du
A. Luu
Bin Ji
Qian Liu
See-Kiong Ng
ALM
ELM
OffRL
24
6
0
12 Feb 2024
Calibrating Long-form Generations from Large Language Models
Yukun Huang
Yixin Liu
Raghuveer Thirukovalluru
Arman Cohan
Bhuwan Dhingra
27
7
0
09 Feb 2024
A Roadmap to Pluralistic Alignment
Taylor Sorensen
Jared Moore
Jillian R. Fisher
Mitchell L. Gordon
Niloofar Mireshghallah
...
Liwei Jiang
Ximing Lu
Nouha Dziri
Tim Althoff
Yejin Choi
65
80
0
07 Feb 2024
Transforming and Combining Rewards for Aligning Large Language Models
Zihao Wang
Chirag Nagpal
Jonathan Berant
Jacob Eisenstein
Alex D'Amour
Oluwasanmi Koyejo
Victor Veitch
19
11
0
01 Feb 2024
Even-if Explanations: Formal Foundations, Priorities and Complexity
Gianvincenzo Alfano
S. Greco
Domenico Mandaglio
Francesco Parisi
Reza Shahbazian
I. Trubitsyna
26
2
0
17 Jan 2024
Align on the Fly: Adapting Chatbot Behavior to Established Norms
Chunpu Xu
Steffi Chern
Ethan Chern
Ge Zhang
Zekun Wang
Ruibo Liu
Jing Li
Jie Fu
Pengfei Liu
24
20
0
26 Dec 2023
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape
Timothy R. McIntosh
Teo Susnjak
Tong Liu
Paul Watters
Malka N. Halgamuge
91
46
0
18 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
27
76
0
17 Dec 2023
"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming
Connor Lawless
Jakob Schoeffer
Lindy Le
Kael Rowan
Shilad Sen
Cristina St. Hill
Jina Suh
Bahar Sarrafzadeh
41
8
0
12 Dec 2023
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi
Tobia Poppi
Federico Cocchi
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
27
8
0
27 Nov 2023
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace
Meihua Dang
Rafael Rafailov
Linqi Zhou
Aaron Lou
Senthil Purushwalkam
Stefano Ermon
Caiming Xiong
Shafiq R. Joty
Nikhil Naik
EGVM
50
227
0
21 Nov 2023
Aligned: A Platform-based Process for Alignment
Ethan Shaotran
Ido Pesok
Sam Jones
Emi Liu
19
1
0
15 Nov 2023
Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
Bowen Tan
Yun Zhu
Lijuan Liu
Eric P. Xing
Zhiting Hu
Jindong Chen
ALM
LRM
45
7
0
12 Nov 2023
Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
Fengqing Jiang
Zhangchen Xu
Luyao Niu
Wei Ping
Jinyuan Jia
Bo Li
Radha Poovendran
AAML
21
19
0
07 Nov 2023
Leveraging Large Language Models for Collective Decision-Making
Marios Papachristou
Longqi Yang
Chin-Chia Hsu
LLMAG
39
2
0
03 Nov 2023
Knowledge Editing for Large Language Models: A Survey
Song Wang
Yaochen Zhu
Haochen Liu
Zaiyi Zheng
Chen Chen
Wenlin Yao
KELM
74
133
0
24 Oct 2023
Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature
Alejandro Lozano
Scott L. Fleming
Chia-Chun Chiang
Nigam Shah
ELM
RALM
28
32
0
24 Oct 2023
Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?
Xiaoxi Kang
Lizhen Qu
Lay-Ki Soon
Adnan Trakic
Terry Yue Zhuo
Patrick Charles Emerton
Genevieve Grant
LRM
AILaw
ELM
123
13
0
23 Oct 2023
Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation
M. Boubdir
Edward Kim
Beyza Ermis
Marzieh Fadaee
Sara Hooker
ALM
33
18
0
22 Oct 2023
Towards Understanding Sycophancy in Language Models
Mrinank Sharma
Meg Tong
Tomasz Korbak
David Duvenaud
Amanda Askell
...
Oliver Rausch
Nicholas Schiefer
Da Yan
Miranda Zhang
Ethan Perez
213
192
0
20 Oct 2023
CoMPosT: Characterizing and Evaluating Caricature in LLM Simulations
Myra Cheng
Tiziano Piccardi
Diyi Yang
LLMAG
18
67
0
17 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Ruoyu Sun
Zhimin Luo
27
47
0
16 Oct 2023
Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization
Ondrej Skopek
Rahul Aralikatte
Sian Gooding
Victor Carbune
ELM
44
18
0
12 Oct 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Hannah Rose Kirk
Andrew M. Bean
Bertie Vidgen
Paul Röttger
Scott A. Hale
ALM
21
41
0
11 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
36
47
0
06 Oct 2023
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
Zhanhui Zhou
Jie Liu
Chao Yang
Jing Shao
Yu Liu
Xiangyu Yue
Wanli Ouyang
Yu Qiao
40
48
0
05 Oct 2023
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Hannah Rose Kirk
Bertie Vidgen
Paul Röttger
Scott A. Hale
47
2
0
03 Oct 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
35
6
0
30 Sep 2023
Decolonial AI Alignment: Openness, Viśeṣa-Dharma, and Including Excluded Knowledges
Kush R. Varshney
44
2
0
10 Sep 2023
Generative Social Choice
Sara Fish
Paul Gölz
David C. Parkes
Ariel D. Procaccia
Gili Rusak
Itai Shapira
Manuel Wüthrich
30
26
0
03 Sep 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
47
473
0
27 Jul 2023
Evaluating the Moral Beliefs Encoded in LLMs
Nino Scherrer
Claudia Shi
Amir Feder
David M. Blei
33
117
0
26 Jul 2023
Decoding ChatGPT: A Taxonomy of Existing Research, Current Challenges, and Possible Future Directions
S. Sohail
Faiza Farhat
Yassine Himeur
Mohammad Nadeem
D. Madsen
Yashbir Singh
Shadi Atalla
W. Mansoor
33
115
0
26 Jul 2023
Opinion Mining Using Population-tuned Generative Language Models
Allmin Pradhap Singh Susaiyah
Abhinay Pandya
Aki Härmä
15
0
0
24 Jul 2023