2407.10241
BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs
14 July 2024
Zhiting Fan
Ruizhe Chen
Ruiling Xu
Zuozhu Liu
KELM
Papers citing
"BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs"
10 / 10 papers shown
Title
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
Zhiting Fan
Ruizhe Chen
Zuozhu Liu
44
0
0
30 Apr 2025
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Xiaotian Zhang
Ruizhe Chen
Yang Feng
Zuozhu Liu
40
0
0
17 Apr 2025
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language models
Charaka Vinayak Kumar
Ashok Urlana
Gopichand Kanumolu
B. Garlapati
Pruthwik Mishra
ELM
52
0
0
15 Mar 2025
DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
Ruizhe Chen
Wenhao Chai
Zhifei Yang
Xiaotian Zhang
Qiufeng Wang
Tony Q. S. Quek
Soujanya Poria
Zuozhu Liu
50
0
0
06 Mar 2025
Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs
Hannah Cyberey
Yangfeng Ji
David E. Evans
LLMSV
72
1
0
27 Feb 2025
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
Zhiting Fan
Ruizhe Chen
Tianxiang Hu
Zuozhu Liu
23
7
0
25 Oct 2024
A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions
Rajesh Ranjan
Shailja Gupta
Surya Narayan Singh
31
9
0
24 Sep 2024
Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?
Seungbin Yang
ChaeHun Park
Taehee Kim
Jaegul Choo
46
2
0
18 Jun 2024
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai
Zeqiu Wu
Yizhong Wang
Avirup Sil
Hannaneh Hajishirzi
RALM
159
631
0
17 Oct 2023
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
217
367
0
15 Oct 2021