In conversation with Artificial Intelligence: aligning language models with human values

1 September 2022
Atoosa Kasirzadeh
Iason Gabriel
arXiv:2209.00731

Papers citing "In conversation with Artificial Intelligence: aligning language models with human values"

43 / 43 papers shown
HALO: Human-Aligned End-to-end Image Retargeting with Layered Transformations
Yiran Xu
Siqi Xie
Zhuofang Li
Harris Shadmany
Yinxiao Li
...
Jesse Berent
Ming Yang
Irfan Essa
Jia-Bin Huang
Feng Yang
VOS
58
0
0
03 Apr 2025
Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals
Lucy Havens
Benjamin Bach
Melissa Mhairi Terras
Beatrice Alex
46
0
0
01 Apr 2025
A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications
Jian-Yu Guan
J. Wu
J. Li
Chuanqi Cheng
Wei Yu Wu
LM&MA
69
0
0
21 Mar 2025
From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment
J. Li
Jian-Yu Guan
Songhao Wu
Wei Yu Wu
Rui Yan
62
1
0
19 Mar 2025
On Benchmarking Human-Like Intelligence in Machines
Lance Ying
K. M. Collins
L. Wong
Ilia Sucholutsky
Ryan Liu
Adrian Weller
Tianmin Shu
Thomas L. Griffiths
Joshua B. Tenenbaum
ALM
ELM
113
3
0
27 Feb 2025
Do LLMs exhibit demographic parity in responses to queries about Human Rights?
Rafiya Javed
Jackie Kay
David Yanni
Abdullah Zaini
Anushe Sheikh
Maribeth Rauh
Iason Gabriel
Laura Weidinger
59
0
0
26 Feb 2025
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants
Lize Alberts
Benjamin Ellis
Andrei Lupu
Jakob Foerster
ELM
34
1
0
28 Oct 2024
Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models
Eddie L. Ungless
Nikolas Vitsakis
Zeerak Talat
James Garforth
Bjorn Ross
Arno Onken
Atoosa Kasirzadeh
Alexandra Birch
28
1
0
17 Oct 2024
Beyond Preferences in AI Alignment
Tan Zhi-Xuan
Micah Carroll
Matija Franklin
Hal Ashton
33
16
0
30 Aug 2024
Epistemic Injustice in Generative AI
Jackie Kay
Atoosa Kasirzadeh
Shakir Mohamed
AILaw
34
6
0
21 Aug 2024
On the Undecidability of Artificial Intelligence Alignment: Machines that Halt
Gabriel Adriano de Melo
Marcos Ricardo Omena de Albuquerque Máximo
Nei Yoshihiro Soma
Paulo Andre Lima de Castro
25
0
0
16 Aug 2024
Adaptive Retrieval-Augmented Generation for Conversational Systems
Xi Wang
Procheta Sen
Ruizhe Li
Emine Yilmaz
RALM
26
5
0
31 Jul 2024
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
44
6
0
02 Jul 2024
A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians
Piotr Wojciech Mirowski
Juliette Love
K. Mathewson
Shakir Mohamed
32
19
0
31 May 2024
Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models
Chia-Yi Hsu
Yu-Lin Tsai
Chih-Hsun Lin
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
44
32
0
27 May 2024
Sociotechnical Implications of Generative Artificial Intelligence for Information Access
Bhaskar Mitra
Henriette Cramer
Olya Gurevich
42
2
0
19 May 2024
ChatGPT Role-play Dataset: Analysis of User Motives and Model Naturalness
Sabrina Bodmer
Ameeta Agrawal
Judit Dombi
Tetyana Sydorenko
Jung In Lee
19
4
0
26 Mar 2024
Assessment of Multimodal Large Language Models in Alignment with Human Values
Zhelun Shi
Zhipin Wang
Hongxing Fan
Zaibin Zhang
Lijun Li
Yongting Zhang
Zhen-fei Yin
Lu Sheng
Yu Qiao
Jing Shao
32
14
0
26 Mar 2024
Language Models in Dialogue: Conversational Maxims for Human-AI Interactions
Erik Miehling
Manish Nagireddy
P. Sattigeri
Elizabeth M. Daly
David Piorkowski
John T. Richards
ALM
34
11
0
22 Mar 2024
Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes
Xiaomeng Hu
Pin-Yu Chen
Tsung-Yi Ho
AAML
24
26
0
01 Mar 2024
I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models
Wenchao Dong
Assem Zhunis
Hyojin Chin
Jiyoung Han
Meeyoung Cha
30
2
0
16 Feb 2024
Mapping the Ethics of Generative AI: A Comprehensive Scoping Review
Thilo Hagendorff
21
35
0
13 Feb 2024
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
Ryan Liu
T. Sumers
Ishita Dasgupta
Thomas L. Griffiths
LLMAG
35
13
0
11 Feb 2024
From Bytes to Biases: Investigating the Cultural Self-Perception of Large Language Models
Wolfgang Messner
Tatum Greene
Josephine Matalone
19
4
0
21 Dec 2023
A Survey of the Evolution of Language Model-Based Dialogue Systems
Hongru Wang
Lingzhi Wang
Yiming Du
Liang Chen
Jing Zhou
Yufei Wang
Kam-Fai Wong
LRM
53
20
0
28 Nov 2023
Knowledge Editing for Large Language Models: A Survey
Song Wang
Yaochen Zhu
Haochen Liu
Zaiyi Zheng
Chen Chen
Jundong Li
KELM
66
133
0
24 Oct 2023
Can ChatGPT Perform Reasoning Using the IRAC Method in Analyzing Legal Scenarios Like a Lawyer?
Xiaoxi Kang
Lizhen Qu
Lay-Ki Soon
Adnan Trakic
Terry Yue Zhuo
Patrick Charles Emerton
Genevieve Grant
LRM
AILaw
ELM
115
13
0
23 Oct 2023
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
Hannah Rose Kirk
Bertie Vidgen
Paul Röttger
Scott A. Hale
39
2
0
03 Oct 2023
Cultural Alignment in Large Language Models: An Explanatory Analysis Based on Hofstede's Cultural Dimensions
Reem I. Masoud
Ziquan Liu
Martin Ferianc
Philip C. Treleaven
Miguel R. D. Rodrigues
19
50
0
25 Aug 2023
Artificial Intelligence and Aesthetic Judgment
Jessica Hullman
Ari Holtzman
Andrew Gelman
15
3
0
21 Aug 2023
ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey
S. Mohamadi
G. Mujtaba
Ngan Le
Gianfranco Doretto
Don Adjeroh
LM&MA
AI4MH
23
21
0
09 Jul 2023
Towards Measuring the Representation of Subjective Global Opinions in Language Models
Esin Durmus
Karina Nguyen
Thomas I. Liao
Nicholas Schiefer
Amanda Askell
...
Alex Tamkin
Janel Thamkul
Jared Kaplan
Jack Clark
Deep Ganguli
33
205
0
28 Jun 2023
Appropriateness is all you need!
Hendrik Kempt
A. Lavie
S. Nagel
20
1
0
27 Apr 2023
ChatGPT, Large Language Technologies, and the Bumpy Road of Benefiting Humanity
Atoosa Kasirzadeh
SILM
13
2
0
21 Apr 2023
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
Hannah Rose Kirk
Bertie Vidgen
Paul Röttger
Scott A. Hale
33
99
0
09 Mar 2023
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Ruisi Cai
Zhenyu (Allen) Zhang
Zhangyang Wang
AAML
OOD
22
12
0
24 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mökander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
39
194
0
16 Feb 2023
Inclusive Artificial Intelligence
Dilip Arumugam
Shi Dong
Benjamin Van Roy
40
1
0
24 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
43
424
0
08 Dec 2022
Truthful AI: Developing and governing AI that does not lie
Owain Evans
Owen Cotton-Barratt
Lukas Finnveden
Adam Bales
Avital Balwit
Peter Wills
Luca Righetti
William Saunders
HILM
228
109
0
13 Oct 2021
Challenges in Detoxifying Language Models
Johannes Welbl
Amelia Glaese
J. Uesato
Sumanth Dathathri
John F. J. Mellor
Lisa Anne Hendricks
Kirsty Anderson
Pushmeet Kohli
Ben Coppin
Po-Sen Huang
LM&MA
242
193
0
15 Sep 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Alex Tamkin
Miles Brundage
Jack Clark
Deep Ganguli
AILaw
ELM
200
258
0
04 Feb 2021
A Survey on Bias and Fairness in Machine Learning
Ninareh Mehrabi
Fred Morstatter
N. Saxena
Kristina Lerman
Aram Galstyan
SyDa
FaML
314
4,203
0
23 Aug 2019