Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.03021
Cited By
The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
6 April 2022
Caleb Ziems
Jane A. Yu
Yi-Chia Wang
A. Halevy
Diyi Yang
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems"
21 / 21 papers shown
Title
Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stañczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Böttinger
...
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Siva Reddy
147
0
0
27 Feb 2025
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su
Xuhui Zhou
Sanketh Rangreji
Anubha Kabra
Julia Mendelsohn
Faeze Brahman
Maarten Sap
LLMAG
106
3
0
13 Sep 2024
CELL your Model: Contrastive Explanations for Large Language Models
Ronny Luss
Erik Miehling
Amit Dhurandhar
47
0
0
17 Jun 2024
SaGE: Evaluating Moral Consistency in Large Language Models
Vamshi Krishna Bonagiri
Sreeram Vennam
Priyanshul Govil
Ponnurangam Kumaraguru
Manas Gaur
ELM
56
0
0
21 Feb 2024
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Trucscva
Marie-Francine Moens
33
1
0
27 Nov 2023
Large Language Models in Education: Vision and Opportunities
Wensheng Gan
Zhenlian Qi
Jiayang Wu
Chun-Wei Lin
AI4Ed
44
70
0
22 Nov 2023
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models
Yuwei Wang
Enmeng Lu
Zizhe Ruan
Yao Liang
Yi Zeng
AI4TS
29
4
0
09 Oct 2023
NormBank: A Knowledge Bank of Situational Social Norms
Caleb Ziems
Jane Dwivedi-Yu
Yi-Chia Wang
A. Halevy
Diyi Yang
23
41
0
26 May 2023
NormMark: A Weakly Supervised Markov Model for Socio-cultural Norm Discovery
Farhad Moghimifar
Shilin Qu
Tongtong Wu
Yuan-Fang Li
Gholamreza Haffari
34
4
0
26 May 2023
Affective Faces for Goal-Driven Dyadic Communication
Scott Geng
Revant Teotia
Purva Tendulkar
Sachit Menon
Carl Vondrick
VGen
26
18
0
26 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
99
35
0
01 Jan 2023
SafeText: A Benchmark for Exploring Physical Safety in Language Models
Sharon Levy
Emily Allaway
Melanie Subbiah
Lydia B. Chilton
D. Patton
Kathleen McKeown
William Yang Wang
59
40
0
18 Oct 2022
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi Ren Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
21
39
0
16 Oct 2022
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Gabriel Simmons
105
57
0
24 Sep 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
John J. Nay
ELM
AILaw
88
27
0
14 Sep 2022
Target-Guided Dialogue Response Generation Using Commonsense and Data Augmentation
Prakhar Gupta
Harsh Jhamtani
Jeffrey P. Bigham
49
12
0
19 May 2022
A Word on Machine Ethics: A Response to Jiang et al. (2021)
Zeerak Talat
Hagen Blix
Josef Valvoda
M. I. Ganesh
Ryan Cotterell
Adina Williams
SyDa
FaML
96
38
0
07 Nov 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
261
1,996
0
31 Dec 2020
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,595
0
18 Sep 2019
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
214
1,327
0
05 Jun 2016
1