2210.11610
Large Language Models Can Self-Improve
20 October 2022
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM
AI4MH
LRM
Papers citing
"Large Language Models Can Self-Improve"
50 / 410 papers shown
Title
STAIR: Improving Safety Alignment with Introspective Reasoning
Y. Zhang
Siyuan Zhang
Yao Huang
Zeyu Xia
Zhengwei Fang
Xiao Yang
Ranjie Duan
Dong Yan
Yinpeng Dong
Jun Zhu
LRM
LLMSV
56
3
0
04 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
80
4
0
03 Feb 2025
CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering
Yumeng Wang
Zhiyuan Fan
Q. Wang
May Fung
Heng Ji
80
1
0
30 Jan 2025
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Zhenhailong Wang
Haiyang Xu
Junyang Wang
Xi Zhang
Ming Yan
J. Zhang
Fei Huang
Heng Ji
43
9
0
20 Jan 2025
QualityFlow: An Agentic Workflow for Program Synthesis Controlled by LLM Quality Checks
Yaojie Hu
Qiang Zhou
Qihong Chen
Xiaopeng Li
Linbo Liu
Dejiao Zhang
Amit Kachroo
Talha Oz
Omer Tripp
66
4
0
20 Jan 2025
Aligning Instruction Tuning with Pre-training
Yiming Liang
Tianyu Zheng
Xinrun Du
Ge Zhang
J. Liu
...
Zhaoxiang Zhang
Wenhao Huang
Xiang Yue
Jiajun Zhang
86
1
0
16 Jan 2025
Cascaded Self-Evaluation Augmented Training for Lightweight Multimodal LLMs
Zheqi Lv
Wenkai Wang
Jiawei Wang
Shengyu Zhang
Fei Wu
LRM
ReLM
51
0
0
10 Jan 2025
Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting
Dong-Hai Zhu
Yu-Jie Xiong
Jia-Chen Zhang
Xi-Jiong Xie
Chun-Ming Xia
ReLM
LRM
37
0
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Yifan Li
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
88
11
0
06 Jan 2025
Nash CoT: Multi-Path Inference with Preference Equilibrium
Ziqi Zhang
Cunxiang Wang
Xiong Xiao
Yue Zhang
Donglin Wang
LRM
39
1
0
31 Dec 2024
Malware Classification using a Hybrid Hidden Markov Model-Convolutional Neural Network
Ritik Mehta
Olha Jurecková
Mark Stamp
59
0
0
25 Dec 2024
Robust Semi-Supervised Learning in Open Environments
Lan-Zhe Guo
Lin Jia
Jie-Jing Shao
Yu-Feng Li
OffRL
31
2
0
24 Dec 2024
Diving into Self-Evolving Training for Multimodal Reasoning
Wei Liu
Junlong Li
Xiwen Zhang
Fan Zhou
Yu Cheng
Junxian He
ReLM
LRM
41
11
0
23 Dec 2024
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
Weihao Zeng
Yuzhen Huang
Lulu Zhao
Yijun Wang
Zifei Shan
Junxian He
LRM
35
7
0
23 Dec 2024
PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection
Sepideh Mamooler
Syrielle Montariol
Alexander Mathis
Antoine Bosselut
90
1
0
16 Dec 2024
Enhancing Mathematical Reasoning in LLMs with Background Operators
Jiajun Chen
Yik-Cheung Tam
LRM
68
0
0
05 Dec 2024
Towards Adaptive Mechanism Activation in Language Agent
Ziyang Huang
Jun Zhao
Kang-Jun Liu
LLMAG
AI4CE
78
0
0
01 Dec 2024
VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni
Pooyan Fazli
VLM
103
2
0
01 Dec 2024
MATATA: Weakly Supervised End-to-End MAthematical Tool-Augmented Reasoning for Tabular Applications
Vishnou Vinayagame
Gregory Senay
Luis Martí
LRM
ReLM
63
0
0
28 Nov 2024
From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge
Dawei Li
Bohan Jiang
Liangjie Huang
Alimohammad Beigi
Chengshuai Zhao
...
Canyu Chen
Tianhao Wu
Kai Shu
Lu Cheng
Huan Liu
ELM
AILaw
120
67
0
25 Nov 2024
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Yu Zhao
Huifeng Yin
Bo Zeng
Hao Wang
Tianqi Shi
Chenyang Lyu
Longyue Wang
Weihua Luo
Kaifu Zhang
ReLM
LRM
AI4CE
74
58
0
21 Nov 2024
Clustering Algorithms and RAG Enhancing Semi-Supervised Text Classification with Large LLMs
Shan Zhong
Jiahao Zeng
Yongxin Yu
Bohong Lin
34
1
0
09 Nov 2024
GraphXForm: Graph transformer for computer-aided molecular design
Jonathan Pirnay
Jan G. Rittig
Alexander B. Wolf
Martin Grohe
Jakob Burger
Alexander Mitsos
D. G. Grimm
AI4CE
51
1
0
03 Nov 2024
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?
Ioannis Tsiamas
Matthias Sperber
Andrew Finch
Sarthak Garg
31
0
0
31 Oct 2024
Vision-Language Models Can Self-Improve Reasoning via Reflection
Kanzhi Cheng
Yantao Li
Fangzhi Xu
Jianbing Zhang
Hao Zhou
Yang Liu
ReLM
LRM
47
17
0
30 Oct 2024
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?
Zhengkai Lin
Z. Fu
Kai Liu
Liang Xie
Binbin Lin
Wenxiao Wang
D. Cai
Yue Wu
Jieping Ye
LRM
25
3
0
24 Oct 2024
C^2: Scalable Auto-Feedback for LLM-based Chart Generation
Woosung Koh
Jang Han Yoon
M. Lee
Youngjin Song
Jaegwan Cho
Jaehyun Kang
Taehyeon Kim
Se-Young Yun
Youngjae Yu
B. Lee
42
0
0
24 Oct 2024
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data
Anup Shirgaonkar
Nikhil Pandey
Nazmiye Ceren Abay
Tolga Aktas
Vijay Aski
ALM
SyDa
29
0
0
24 Oct 2024
Improving Model Factuality with Fine-grained Critique-based Evaluator
Yiqing Xie
Wenxuan Zhou
Pradyot Prakash
Di Jin
Yuning Mao
...
Sinong Wang
Han Fang
Carolyn Rose
Daniel Fried
Hejia Zhang
HILM
33
5
0
24 Oct 2024
CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
LRM
26
0
0
23 Oct 2024
Who is Undercover? Guiding LLMs to Explore Multi-Perspective Team Tactic in the Game
Ruiqi Dong
Zhixuan Liao
Guangwei Lai
Yuhan Ma
Danni Ma
Chenyou Fan
LLMAG
34
0
0
20 Oct 2024
A Survey on Data Synthesis and Augmentation for Large Language Models
Ke Wang
Jiahui Zhu
Minjie Ren
Z. Liu
Shiwei Li
...
Chenkai Zhang
Xiaoyu Wu
Qiqi Zhan
Qingjie Liu
Yunhong Wang
SyDa
40
15
0
16 Oct 2024
CREAM: Consistency Regularized Self-Rewarding Language Models
Z. Wang
Weilei He
Zhiyuan Liang
Xuchao Zhang
Chetan Bansal
Ying Wei
Weitong Zhang
Huaxiu Yao
ALM
101
7
0
16 Oct 2024
Toolken+: Improving LLM Tool Usage with Reranking and a Reject Option
Konstantin Yakovlev
Sergey I. Nikolenko
A. Bout
21
0
0
15 Oct 2024
Self-adaptive Multimodal Retrieval-Augmented Generation
Wenjia Zhai
VLM
39
0
0
15 Oct 2024
Browsing without Third-Party Cookies: What Do You See?
Maxwell Lin
Shihan Lin
Helen Wu
Karen Wang
Xiaowei Yang
BDL
53
0
0
14 Oct 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
Enyu Zhou
Guodong Zheng
B. Wang
Zhiheng Xi
Shihan Dou
...
Yurong Mou
Rui Zheng
Tao Gui
Qi Zhang
Xuanjing Huang
ALM
59
18
0
13 Oct 2024
COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement
Yuxi Xie
Anirudh Goyal
Xiaobao Wu
Xunjian Yin
Xiao Xu
Min-Yen Kan
Liangming Pan
William Yang Wang
LRM
83
1
0
12 Oct 2024
Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines
Junyu Lai
Jiahe Xu
Yao Yang
Yunpeng Huang
Chun Cao
Jingwei Xu
LRM
37
2
0
10 Oct 2024
O1 Replication Journey: A Strategic Progress Report -- Part 1
Yiwei Qin
Xuefeng Li
Haoyang Zou
Yixiu Liu
Shijie Xia
...
Yixin Ye
Weizhe Yuan
Hector Liu
Y. Li
Pengfei Liu
VLM
40
67
0
08 Oct 2024
Rational Metareasoning for Large Language Models
C. Nicolò De Sabbata
T. Sumers
Thomas L. Griffiths
ReLM
LRM
28
1
0
07 Oct 2024
Mirror-Consistency: Harnessing Inconsistency in Majority Voting
Siyuan Huang
Zhiyuan Ma
Jintao Du
Changhua Meng
Weiqiang Wang
Zhouhan Lin
LRM
29
3
0
07 Oct 2024
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
Jiayi He
Hehai Lin
Q. Wang
Yi Ren Fung
Heng Ji
ReLM
LRM
101
4
0
05 Oct 2024
SELU: Self-Learning Embodied MLLMs in Unknown Environments
Boyu Li
Haobin Jiang
Ziluo Ding
Xinrun Xu
Haoran Li
Dongbin Zhao
Zongqing Lu
LRM
49
2
0
04 Oct 2024
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
Huimu Yu
Xing Wu
Weidong Yin
Debing Zhang
Songlin Hu
LRM
31
5
0
03 Oct 2024
Better Instruction-Following Through Minimum Bayes Risk
Ian Wu
Patrick Fernandes
Amanda Bertsch
Seungone Kim
Sina Pakazad
Graham Neubig
48
9
0
03 Oct 2024
ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
Xiangyu Peng
Congying Xia
Xinyi Yang
Caiming Xiong
Chien-Sheng Wu
Chen Xing
LRM
48
2
0
03 Oct 2024
Disentangling Latent Shifts of In-Context Learning Through Self-Training
Josip Jukić
Jan Snajder
21
0
0
02 Oct 2024
TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking
Danqing Wang
Jianxin Ma
Fei Fang
Lei Li
LLMAG
LRM
149
0
0
02 Oct 2024
Multimodal Auto Validation For Self-Refinement in Web Agents
Ruhana Azam
Tamer Abuelsaad
Aditya Vempaty
Ashish Jagmohan
26
1
0
01 Oct 2024