Down the Toxicity Rabbit Hole: A Novel Framework to Bias Audit Large Language Models

8 September 2023
Arka Dutta
Adel Khorramrouz
Sujan Dutta
Ashiqur R. KhudaBukhsh

Papers citing "Down the Toxicity Rabbit Hole: A Novel Framework to Bias Audit Large Language Models"

3 / 3 papers shown
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM
AAML
227
502
0
28 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
319
11,953
0
04 Mar 2022
A Framework for the Computational Linguistic Analysis of Dehumanization
Julia Mendelsohn
Yulia Tsvetkov
Dan Jurafsky
84
89
0
06 Mar 2020