ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.09627
  4. Cited By
Mitigating Biases for Instruction-following Language Models via Bias
  Neurons Elimination
v1v2 (latest)

Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination

16 November 2023
Nakyeong Yang
Taegwan Kang
Stanley Jungkyu Choi
Honglak Lee
Kyomin Jung
ArXiv (abs)PDFHTML

Papers citing "Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination"

7 / 7 papers shown
Title
Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing
Benchmarking and Pushing the Multi-Bias Elimination Boundary of LLMs via Causal Effect Estimation-guided Debiasing
Zhouhao Sun
Zhiyuan Kan
Xiao Ding
Li Du
Yang Zhao
Bing Qin
Ting Liu
66
0
0
22 May 2025
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLMLRM
194
3,146
0
20 Oct 2022
Ethical and social risks of harm from Language Models
Ethical and social risks of harm from Language Models
Laura Weidinger
John F. J. Mellor
Maribeth Rauh
Conor Griffin
J. Uesato
...
Lisa Anne Hendricks
William S. Isaac
Sean Legassick
G. Irving
Iason Gabriel
PILM
117
1,041
0
08 Dec 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
348
1,706
0
15 Oct 2021
What Makes Good In-Context Examples for GPT-$3$?
What Makes Good In-Context Examples for GPT-333?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAMLRALM
388
1,387
0
17 Jan 2021
Studying the Inductive Biases of RNNs with Synthetic Variations of
  Natural Languages
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Shauli Ravfogel
Yoav Goldberg
Tal Linzen
69
71
0
15 Mar 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,182
0
20 Apr 2018
1