2402.11436
Cited By
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
18 February 2024
Wenda Xu
Guanglei Zhu
Xuandong Zhao
Liangming Pan
Lei Li
Wenjie Wang
Github (8★)
Papers citing
"Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement"
9 / 9 papers shown
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks
Judy Hanwen Shen
Carlos Guestrin
185
1
0
09 Apr 2025
Cognitive Debiasing Large Language Models for Decision-Making
Yougang Lyu
Shijie Ren
Yue Feng
Zihan Wang
Zhongfu Chen
Zhaochun Ren
Maarten de Rijke
184
0
0
05 Apr 2025
Preference Leakage: A Contamination Problem in LLM-as-a-judge
Dawei Li
Renliang Sun
Yue Huang
Ming Zhong
Bohan Jiang
Jiawei Han
Wei Wei
Wei Wang
Huan Liu
140
29
0
03 Feb 2025
Visual Prompting with Iterative Refinement for Design Critique Generation
Peitong Duan
Chin-Yi Cheng
Bjoern Hartmann
Yang Li
133
0
0
22 Dec 2024
Improving Model Factuality with Fine-grained Critique-based Evaluator
Yiqing Xie
Wenxuan Zhou
Pradyot Prakash
Di Jin
Yuning Mao
...
Sinong Wang
Han Fang
Carolyn Rose
Daniel Fried
Hejia Zhang
HILM
126
8
0
24 Oct 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
Nandan Thakur
Suleman Kazi
Ge Luo
Jimmy J. Lin
Amin Ahmad
VLM
RALM
197
7
0
17 Oct 2024
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
Jiayi He
Hehai Lin
Q. Wang
Yi R. Fung
Chenhui Xu
ReLM
LRM
193
7
0
05 Oct 2024
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
Ilya Gusev
LLMAG
97
3
0
10 Sep 2024
From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks
Andreas Stephan
D. Zhu
Matthias Aßenmacher
Xiaoyu Shen
Benjamin Roth
ELM
87
5
0
06 Sep 2024