ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2206.05802
  4. Cited By
Self-critiquing models for assisting human evaluators
v1v2 (latest)

Self-critiquing models for assisting human evaluators

12 June 2022
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Ouyang Long
Jonathan Ward
Jan Leike
    ALMELM
ArXiv (abs)PDFHTML

Papers citing "Self-critiquing models for assisting human evaluators"

50 / 238 papers shown
Title
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang
Haoyuan Peng
Shanshan Feng
Fan Li
Dongsheng Li
KELMLRM
108
16
0
09 May 2024
Conversational Topic Recommendation in Counseling and Psychotherapy with
  Decision Transformer and Large Language Models
Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models
Aylin Gunal
Baihan Lin
Djallel Bouneffouf
OffRLAI4MHLM&MA
68
1
0
08 May 2024
General Purpose Verification for Chain of Thought Prompting
General Purpose Verification for Chain of Thought Prompting
Robert Vacareanu
Anurag Pratik
Evangelia Spiliopoulou
Zheng Qi
Giovanni Paolini
Neha Ann John
Jie Ma
Yassine Benajiba
Miguel Ballesteros
LRM
44
10
0
30 Apr 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRMKELMReLM
125
43
0
26 Apr 2024
Aligning LLM Agents by Learning Latent Preference from User Edits
Aligning LLM Agents by Learning Latent Preference from User Edits
Ge Gao
Alexey Taymanov
Eduardo Salinas
Paul Mineiro
Dipendra Kumar Misra
LLMAG
94
31
0
23 Apr 2024
A Survey on Self-Evolution of Large Language Models
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRMLM&Ro
101
27
0
22 Apr 2024
Toward Self-Improvement of LLMs via Imagination, Searching, and
  Criticizing
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Ye Tian
Baolin Peng
Linfeng Song
Lifeng Jin
Dian Yu
Haitao Mi
Dong Yu
LRMReLM
112
85
0
18 Apr 2024
LLM Evaluators Recognize and Favor Their Own Generations
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery
Samuel R. Bowman
Shi Feng
124
197
0
15 Apr 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State
  Transition Dynamics
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Derui Zhu
Dingfan Chen
Qing Li
Zongxiong Chen
Lei Ma
Jens Grossklags
Mario Fritz
HILM
89
14
0
06 Apr 2024
IterAlign: Iterative Constitutional Alignment of Large Language Models
IterAlign: Iterative Constitutional Alignment of Large Language Models
Xiusi Chen
Hongzhi Wen
Sreyashi Nag
Chen Luo
Qingyu Yin
Ruirui Li
Zheng Li
Wei Wang
AILaw
36
6
0
27 Mar 2024
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with
  Autoformalization
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Peng Zhou
Charles Staats
Wenda Li
Christian Szegedy
Kilian Q. Weinberger
Yuhuai Wu
LRM
82
39
0
26 Mar 2024
STRUM-LLM: Attributed and Structured Contrastive Summarization
STRUM-LLM: Attributed and Structured Contrastive Summarization
Beliz Gunel
James Bradley Wendt
Jing Xie
Yichao Zhou
Nguyen Vo
Zachary Fisher
Sandeep Tata
50
5
0
25 Mar 2024
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ahmad A Mahmood
Ashmal Vayani
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
LRM
181
9
0
21 Mar 2024
Facilitating Pornographic Text Detection for Open-Domain Dialogue
  Systems via Knowledge Distillation of Large Language Models
Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models
Huachuan Qiu
Shuai Zhang
Hongliang He
Anqi Li
Zhenzhong Lan
94
1
0
20 Mar 2024
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text
  Transformation
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
Yunhao Gou
Kai Chen
Zhili Liu
Lanqing Hong
Hang Xu
Zhenguo Li
Dit-Yan Yeung
James T. Kwok
Yu Zhang
MLLM
125
52
0
14 Mar 2024
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Zhiqing Sun
Longhui Yu
Yikang Shen
Weiyang Liu
Yiming Yang
Sean Welleck
Chuang Gan
93
69
0
14 Mar 2024
Self-Refinement of Language Models from External Proxy Metrics Feedback
Self-Refinement of Language Models from External Proxy Metrics Feedback
Keshav Ramji
Young-Suk Lee
Ramón Fernandez Astudillo
M. Sultan
Tahira Naseem
Asim Munawar
Radu Florian
Salim Roukos
HILM
77
7
0
27 Feb 2024
TruthX: Alleviating Hallucinations by Editing Large Language Models in
  Truthful Space
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
Shaolei Zhang
Tian Yu
Yang Feng
HILMKELM
106
52
0
27 Feb 2024
Navigating Complexity: Orchestrated Problem Solving with Multi-Agent
  LLMs
Navigating Complexity: Orchestrated Problem Solving with Multi-Agent LLMs
Sumedh Rasal
E. Hauer
69
0
0
26 Feb 2024
TEaR: Improving LLM-based Machine Translation with Systematic
  Self-Refinement
TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement
Zhaopeng Feng
Yan Zhang
Hao Li
Bei Wu
Jiayu Liao
Wenqiang Liu
Jun Lang
Yang Feng
Jian Wu
Zuozhu Liu
LRM
138
15
0
26 Feb 2024
Look Before You Leap: Problem Elaboration Prompting Improves
  Mathematical Reasoning in Large Language Models
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
Haoran Liao
Jidong Tian
Shaohua Hu
Hao He
Yaohui Jin
ReLMLRM
86
0
0
24 Feb 2024
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
Zicheng Lin
Zhibin Gou
Tian Liang
Ruilin Luo
Haowei Liu
Yujiu Yang
LRM
109
56
0
22 Feb 2024
Q-Probe: A Lightweight Approach to Reward Maximization for Language
  Models
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
Kenneth Li
Samy Jelassi
Hugh Zhang
Sham Kakade
Martin Wattenberg
David Brandfonbrener
141
11
0
22 Feb 2024
CriticBench: Evaluating Large Language Models as Critic
CriticBench: Evaluating Large Language Models as Critic
Tian Lan
Wenwei Zhang
Chen Xu
Heyan Huang
Dahua Lin
Kai-xiang Chen
Xian-Ling Mao
ELMAI4MHLRM
86
3
0
21 Feb 2024
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue
  Summarization
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
Liyan Tang
Igor Shalyminov
Amy Wing-mei Wong
Jon Burnsky
Jake W. Vincent
...
Hang Su
Lijia Sun
Yi Zhang
Saab Mansour
Kathleen McKeown
HILM
78
54
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELMVLM
175
135
0
20 Feb 2024
Learning to Check: Unleashing Potentials for Self-Correction in Large
  Language Models
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
Che Zhang
Zhenyang Xiao
Chengcheng Han
Yixin Lian
Yuejian Fang
LRM
59
0
0
20 Feb 2024
Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of
  Large Language Models
Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models
Loka Li
Zhenhao Chen
Guan-Hong Chen
Yixuan Zhang
Yusheng Su
Eric P. Xing
Kun Zhang
LRM
93
19
0
19 Feb 2024
FactPICO: Factuality Evaluation for Plain Language Summarization of
  Medical Evidence
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
Sebastian Antony Joseph
Lily Chen
Jan Trienes
Hannah Louisa Göke
Monika Coers
Wei Xu
Byron C. Wallace
Junyi Jessy Li
LM&MAHILM
77
11
0
18 Feb 2024
Dissecting Human and LLM Preferences
Dissecting Human and LLM Preferences
Junlong Li
Fan Zhou
Shichao Sun
Yikai Zhang
Hai Zhao
Pengfei Liu
ALM
89
6
0
17 Feb 2024
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via
  Self-Evaluation
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
Xiaoying Zhang
Baolin Peng
Ye Tian
Jingyan Zhou
Lifeng Jin
Linfeng Song
Haitao Mi
Helen Meng
HILM
91
52
0
14 Feb 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and
  Local Refinements
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla
Sharath Raparthy
Christoforus Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Roberta Railneau
ReLMLRM
98
65
0
13 Feb 2024
Large Language Models as Agents in Two-Player Games
Large Language Models as Agents in Two-Player Games
Yang Liu
Peng Sun
Hang Li
LLMAG
80
4
0
12 Feb 2024
Self-Correcting Self-Consuming Loops for Generative Model Training
Self-Correcting Self-Consuming Loops for Generative Model Training
Nate Gillman
Michael Freeman
Daksh Aggarwal
Chia-Hong Hsu
Calvin Luo
Yonglong Tian
Chen Sun
113
16
0
11 Feb 2024
Aligner: Efficient Alignment by Learning to Correct
Aligner: Efficient Alignment by Learning to Correct
Jiaming Ji
Boyuan Chen
Hantao Lou
Chongye Guo
Borong Zhang
Xuehai Pan
Juntao Dai
Tianyi Qiu
Yaodong Yang
148
40
0
04 Feb 2024
LLM-based NLG Evaluation: Current Status and Challenges
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELMLM&MA
224
41
0
02 Feb 2024
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for
  Hallucination Mitigation
Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation
Yuxin Liang
Zhuoyang Song
Hao Wang
Jiaxing Zhang
HILM
102
36
0
27 Jan 2024
Visibility into AI Agents
Visibility into AI Agents
Alan Chan
Carson Ezell
Max Kaufmann
K. Wei
Lewis Hammond
...
Nitarshan Rajkumar
David M. Krueger
Noam Kolt
Lennart Heim
Markus Anderljung
141
44
0
23 Jan 2024
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Mirac Suzgun
Adam Tauman Kalai
KELMLRMLLMAGReLM
123
78
0
23 Jan 2024
Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language
  Model Critique in Text Generation
Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation
Meng Cao
Lei Shu
Lei Yu
Yun Zhu
Nevan Wichers
Yinxiao Liu
Lei Meng
OffRLALM
64
7
0
14 Jan 2024
Towards Conversational Diagnostic AI
Towards Conversational Diagnostic AI
Tao Tu
Anil Palepu
M. Schaekermann
Khaled Saab
Jan Freyberg
...
Katherine Chou
Greg S. Corrado
Yossi Matias
Alan Karthikesalingam
Vivek Natarajan
AI4MHLM&MA
108
103
0
11 Jan 2024
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk
Dennis Ulmer
Elman Mansimov
Kaixiang Lin
Justin Sun
Xibin Gao
Yi Zhang
LLMAG
76
30
0
10 Jan 2024
The Critique of Critique
The Critique of Critique
Shichao Sun
Junlong Li
Weizhe Yuan
Ruifeng Yuan
Wenjie Li
Pengfei Liu
ELM
79
0
0
09 Jan 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language
  Models
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Zixiang Chen
Yihe Deng
Huizhuo Yuan
Kaixuan Ji
Quanquan Gu
SyDa
148
327
0
02 Jan 2024
LLM Harmony: Multi-Agent Communication for Problem Solving
LLM Harmony: Multi-Agent Communication for Problem Solving
Sumedh Rasal
LLMAG
66
24
0
02 Jan 2024
LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning
  Language Models
LaFFi: Leveraging Hybrid Natural Language Feedback for Fine-tuning Language Models
Qianxi Li
Yingyue Cao
Jikun Kang
Tianpei Yang
Xi Chen
Jun Jin
Matthew E. Taylor
42
2
0
31 Dec 2023
Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models
  through Intervention without Tuning
Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning
Zhongzhi Chen
Xingwu Sun
Xianfeng Jiao
Fengzong Lian
Zhanhui Kang
Di Wang
Cheng-zhong Xu
HILM
81
33
0
29 Dec 2023
Reasons to Reject? Aligning Language Models with Judgments
Reasons to Reject? Aligning Language Models with Judgments
Weiwen Xu
Deng Cai
Zhisong Zhang
Wai Lam
Shuming Shi
ALM
91
15
0
22 Dec 2023
CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update
CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update
Zhi Gao
Yuntao Du
Xintong Zhang
Xiaojian Ma
Wenjuan Han
Song-Chun Zhu
Qing Li
LLMAGVLM
135
25
0
18 Dec 2023
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak
  Supervision
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
...
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
143
299
0
14 Dec 2023
Previous
12345
Next