ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.12146
  4. Cited By
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
v1v2v3 (latest)

Enabling Weak LLMs to Judge Response Reliability via Meta Ranking

19 February 2024
Zijun Liu
Boqun Kou
Peng Li
Ming Yan
Ji Zhang
Fei Huang
Yang Liu
ArXiv (abs)PDFHTML

Papers citing "Enabling Weak LLMs to Judge Response Reliability via Meta Ranking"

25 / 25 papers shown
Title
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang
Qing Yang
Zhiyuan Zeng
Liliang Ren
Liu Liu
...
Jianfeng Gao
Weizhu Chen
Shuaiqiang Wang
Simon Shaolei Du
Yelong Shen
OffRLReLMLRM
299
47
0
29 Apr 2025
Language Model Cascades: Token-level uncertainty and beyond
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
126
56
0
15 Apr 2024
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for
  Instruction Fine-Tuning
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
ALM
147
56
0
07 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
275
569
0
02 Feb 2024
Navigating Uncertainty: Optimizing API Dependency for Hallucination
  Reduction in Closed-Book Question Answering
Navigating Uncertainty: Optimizing API Dependency for Hallucination Reduction in Closed-Book Question Answering
Pierre Erbacher
Louis Falissard
Vincent Guigue
Laure Soulier
HILMRALM
69
4
0
03 Jan 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic
  Data Selection in Instruction Tuning
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
101
239
0
25 Dec 2023
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Renat Aksitov
Sobhan Miryoosefi
Zong-xiao Li
Daliang Li
Sheila Babayan
...
Sushant Prakash
Pranesh Srinivasan
Manzil Zaheer
Felix X. Yu
Sanjiv Kumar
LRMReLMLLMAGKELM
90
53
0
15 Dec 2023
LM-Polygraph: Uncertainty Estimation for Language Models
LM-Polygraph: Uncertainty Estimation for Language Models
Ekaterina Fadeeva
Roman Vashurin
Akim Tsvigun
Artem Vazhentsev
Sergey Petrakov
...
Elizaveta Goncharova
Alexander Panchenko
Maxim Panov
Timothy Baldwin
Artem Shelmanov
57
68
0
13 Nov 2023
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Lianghui Zhu
Xinggang Wang
Xinlong Wang
ELMALM
119
142
0
26 Oct 2023
Reinforced Self-Training (ReST) for Language Modeling
Reinforced Self-Training (ReST) for Language Modeling
Çağlar Gülçehre
T. Paine
S. Srinivasan
Ksenia Konyushkova
L. Weerts
...
Chenjie Gu
Wolfgang Macherey
Arnaud Doucet
Orhan Firat
Nando de Freitas
OffRL
125
309
0
17 Aug 2023
CMMLU: Measuring massive multitask language understanding in Chinese
CMMLU: Measuring massive multitask language understanding in Chinese
Haonan Li
Yixuan Zhang
Fajri Koto
Yifei Yang
Hai Zhao
Yeyun Gong
Nan Duan
Tim Baldwin
ALMELM
112
273
0
15 Jun 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALMOSLMELM
438
4,444
0
09 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward
  Model
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
389
4,163
0
29 May 2023
Do Large Language Models Know What They Don't Know?
Do Large Language Models Know What They Don't Know?
Zhangyue Yin
Qiushi Sun
Qipeng Guo
Jiawen Wu
Xipeng Qiu
Xuanjing Huang
ELMAI4MH
78
163
0
29 May 2023
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual
  Benchmarking on HumanEval-X
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
Qinkai Zheng
Xiao Xia
Xu Zou
Yuxiao Dong
Shanshan Wang
...
Andi Wang
Yang Li
Teng Su
Zhilin Yang
Jie Tang
ELMALMSyDa
130
341
0
30 Mar 2023
GLM-130B: An Open Bilingual Pre-trained Model
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDLLRM
362
1,094
0
05 Oct 2022
Language Models (Mostly) Know What They Know
Language Models (Mostly) Know What They Know
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
...
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
127
833
0
11 Jul 2022
No Language Left Behind: Scaling Human-Centered Machine Translation
No Language Left Behind: Scaling Human-Centered Machine Translation
Nllb team
Marta R. Costa-jussá
James Cross
Onur cCelebi
Maha Elbayad
...
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
MoE
232
1,268
0
11 Jul 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
886
13,207
0
04 Mar 2022
Learning Compact Metrics for MT
Learning Compact Metrics for MT
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
77
101
0
12 Oct 2021
GLM: General Language Model Pretraining with Autoregressive Blank
  Infilling
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDLAI4CE
142
1,554
0
18 Mar 2021
Reducing conversational agents' overconfidence through linguistic
  calibration
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
290
170
0
30 Dec 2020
Aligning AI With Shared Human Values
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
Jingkai Li
Basel Alomair
Jacob Steinhardt
145
574
0
05 Aug 2020
BLEURT: Learning Robust Metrics for Text Generation
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
103
1,506
0
09 Apr 2020
A Call for Clarity in Reporting BLEU Scores
A Call for Clarity in Reporting BLEU Scores
Matt Post
181
2,998
0
23 Apr 2018
1