Whose Opinions Do Language Models Reflect?
Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto
30 March 2023

Papers citing "Whose Opinions Do Language Models Reflect?"
28 / 278 papers shown

Opportunities and Risks of LLMs for Scalable Deliberation with Polis
Christopher T. Small, Ivan Vendrov, Esin Durmus, Hadjar Homaei, Elizabeth Barry, Julien Cornebise, Ted Suzman, Deep Ganguli, Colin Megill
20 Jun 2023

Intersectionality in Conversational AI Safety: How Bayesian Multilevel Models Help Understand Diverse Perceptions of Safety
Christopher Homan, Greg Serapio-García, Lora Aroyo, Mark Díaz, Alicia Parrish, Vinodkumar Prabhakaran, Alex S. Taylor, Ding Wang
20 Jun 2023

DICES Dataset: Diversity in Conversational AI Evaluation for Safety
Lora Aroyo, Alex S. Taylor, Mark Díaz, Christopher Homan, Alicia Parrish, Greg Serapio-García, Vinodkumar Prabhakaran, Ding Wang
20 Jun 2023

Explore, Establish, Exploit: Red Teaming Language Models from Scratch
Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell
AAML
15 Jun 2023

Questioning the Survey Responses of Large Language Models
Ricardo Dominguez-Olmedo, Moritz Hardt, Celestine Mendler-Dünner
13 Jun 2023

Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks
V. Veselovsky, Manoel Horta Ribeiro, Robert West
13 Jun 2023

Evaluating the Social Impact of Generative AI Systems in Systems and Society
Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan K. Baker, ..., Marie-Therese Png, Shubham Singh, A. Strait, Lukas Struppek, Arjun Subramonian
ELM, EGVM
09 Jun 2023

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Alexandre Ramé, Guillaume Couairon, Mustafa Shukor, Corentin Dancette, Jean-Baptiste Gaya, Laure Soulier, Matthieu Cord
MoMe
07 Jun 2023

Generative AI for Product Design: Getting the Right Design and the Design Right
Matthew K. Hong, Shabnam Hakimi, Yan-Ying Chen, Heishiro Toyoda, Charlene C. Wu, M. Klenk
AI4CE
02 Jun 2023

Revisiting the Reliability of Psychological Scales on Large Language Models
Jen-tse Huang, Wenxuan Wang, Man Ho Lam, E. Li, Wenxiang Jiao, Michael R. Lyu
31 May 2023

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Myra Cheng, Esin Durmus, Dan Jurafsky
29 May 2023

Aligning Language Models to User Opinions
EunJeong Hwang, Bodhisattwa Prasad Majumder, Niket Tandon
24 May 2023

Psychological Metrics for Dialog System Evaluation
Salvatore Giorgi, Shreya Havaldar, Farhan S. Ahmed, Zuhaib Akhtar, Shalaka Vaidya, Gary Pan, Pallavi V. Kulkarni, H. A. Schwartz, Joao Sedoc
24 May 2023

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Tarek Naous, Michael Joseph Ryan, Alan Ritter, Wei-ping Xu
23 May 2023

Can Large Language Models Capture Dissenting Human Voices?
Noah Lee, Na Min An, James Thorne
ALM
23 May 2023

Natural Language Decompositions of Implicit Content Enable Better Text Representations
Alexander Miserlis Hoyle, Rupak Sarkar, Pranav Goel, Philip Resnik
AI4CE
23 May 2023

Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, ..., José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins
ALM
01 May 2023

Domain Mastery Benchmark: An Ever-Updating Benchmark for Evaluating Holistic Domain Knowledge of Large Language Model--A Preliminary Release
Zhouhong Gu, Xiaoxuan Zhu, Haoning Ye, Lin Zhang, Zhuozhi Xiong, Zihan Li, Qi He, Sihang Jiang, Hongwei Feng, Yanghua Xiao
ELM, ALM
23 Apr 2023

Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness
Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, Shikun Zhang
ELM, AI4MH
23 Apr 2023

Can Large Language Models Transform Computational Social Science?
Caleb Ziems, William B. Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, Diyi Yang
LLMAG
12 Apr 2023

Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, C. Endres, Thorsten Holz, Mario Fritz
SILM
23 Feb 2023

Reward Gaming in Conditional Text Generation
Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur P. Parikh, He He
16 Nov 2022

Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, ..., John F. J. Mellor, Demis Hassabis, Koray Kavukcuoglu, Lisa Anne Hendricks, G. Irving
ALM, AAML
28 Sep 2022

Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Gabriel Simmons
24 Sep 2022

CommunityLM: Probing Partisan Worldviews from Language Models
Hang Jiang, Doug Beeferman, Brandon Roy, Dwaipayan Roy
15 Sep 2022

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli, Liane Lovitt, John Kernion, Amanda Askell, Yuntao Bai, ..., Nicholas Joseph, Sam McCandlish, C. Olah, Jared Kaplan, Jack Clark
23 Aug 2022

Using cognitive psychology to understand GPT-3
Marcel Binz, Eric Schulz
ELM, LLMAG
21 Jun 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
04 Mar 2022