ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.02144
  4. Cited By
Making Large Language Models Better Reasoners with Alignment

Making Large Language Models Better Reasoners with Alignment

5 September 2023
Peiyi Wang
Lei Li
Liang Chen
Feifan Song
Binghuai Lin
Yunbo Cao
Tianyu Liu
Zhifang Sui
    ALM
    LRM
ArXivPDFHTML

Papers citing "Making Large Language Models Better Reasoners with Alignment"

50 / 54 papers shown
Title
Mathematical Language Models: A Survey
Mathematical Language Models: A Survey
Wen Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
84
13
0
03 Jan 2025
Neuro-Symbolic Data Generation for Math Reasoning
Neuro-Symbolic Data Generation for Math Reasoning
Zenan Li
Zhi-Hua Zhou
Yuan Yao
Yu Li
Chun Cao
Fan Yang
Xian Zhang
Xiaoxing Ma
OffRL
LRM
83
8
0
06 Dec 2024
Synergizing LLMs and Knowledge Graphs: A Novel Approach to Software
  Repository-Related Question Answering
Synergizing LLMs and Knowledge Graphs: A Novel Approach to Software Repository-Related Question Answering
Samuel Abedu
SayedHassan Khatoonabadi
Emad Shihab
80
0
0
05 Dec 2024
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
Matryoshka: Learning to Drive Black-Box LLMs with LLMs
Changhao Li
Yuchen Zhuang
Rushi Qiang
Haotian Sun
H. Dai
Chao Zhang
Bo Dai
LRM
26
4
0
28 Oct 2024
Markov Chain of Thought for Efficient Mathematical Reasoning
Markov Chain of Thought for Efficient Mathematical Reasoning
Wen Yang
Kai Fan
Minpeng Liao
LRM
47
3
0
23 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
37
0
0
14 Oct 2024
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
MentalArena: Self-play Training of Language Models for Diagnosis and Treatment of Mental Health Disorders
Cheng-rong Li
May Fung
Qingyun Wang
Chi Han
Manling Li
Jindong Wang
Heng Ji
AI4MH
221
0
0
09 Oct 2024
CodePMP: Scalable Preference Model Pretraining for Large Language Model
  Reasoning
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
Huimu Yu
Xing Wu
Weidong Yin
Debing Zhang
Songlin Hu
LRM
36
5
0
03 Oct 2024
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Shengyu Feng
Xiang Kong
Shuang Ma
Aonan Zhang
Dong Yin
Chong-Jun Wang
Ruoming Pang
Yiming Yang
LRM
32
0
0
02 Oct 2024
ControlMath: Controllable Data Generation Promotes Math Generalist
  Models
ControlMath: Controllable Data Generation Promotes Math Generalist Models
Nuo Chen
Ning Wu
Jianhui Chang
Jia Li
31
3
0
20 Sep 2024
Towards a Unified View of Preference Learning for Large Language Models:
  A Survey
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao
Feifan Song
Yibo Miao
Zefan Cai
Zheng Yang
...
Houfeng Wang
Zhifang Sui
Peiyi Wang
Baobao Chang
Baobao Chang
55
12
0
04 Sep 2024
Making Large Language Models Better Planners with Reasoning-Decision
  Alignment
Making Large Language Models Better Planners with Reasoning-Decision Alignment
Zhijian Huang
Tao Tang
Shaoxiang Chen
Sihao Lin
Zequn Jie
Lin Ma
Guangrun Wang
Xiaodan Liang
56
10
0
25 Aug 2024
BAPO: Base-Anchored Preference Optimization for Personalized Alignment
  in Large Language Models
BAPO: Base-Anchored Preference Optimization for Personalized Alignment in Large Language Models
Gihun Lee
Minchan Jeong
Yujin Kim
Hojung Jung
Jaehoon Oh
Sangmook Kim
Se-Young Yun
35
1
0
30 Jun 2024
Hybrid Alignment Training for Large Language Models
Hybrid Alignment Training for Large Language Models
Chenglong Wang
Hang Zhou
Kaiyan Chang
Bei Li
Yongyu Mu
Tong Xiao
Tongran Liu
Jingbo Zhu
43
4
0
21 Jun 2024
Finding Safety Neurons in Large Language Models
Finding Safety Neurons in Large Language Models
Jianhui Chen
Xiaozhi Wang
Zijun Yao
Yushi Bai
Lei Hou
Juanzi Li
KELM
LLMSV
48
13
0
20 Jun 2024
LLM Critics Help Catch Bugs in Mathematics: Towards a Better
  Mathematical Verifier with Natural Language Feedback
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Bofei Gao
Zefan Cai
Runxin Xu
Peiyi Wang
Ce Zheng
...
Chang Zhou
Wen Xiao
Junjie Hu
Tianyu Liu
Baobao Chang
LRM
48
17
0
20 Jun 2024
A Survey on Human Preference Learning for Large Language Models
A Survey on Human Preference Learning for Large Language Models
Ruili Jiang
Kehai Chen
Xuefeng Bai
Zhixuan He
Juntao Li
Muyun Yang
Tiejun Zhao
Liqiang Nie
Min Zhang
49
8
0
17 Jun 2024
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from
  Imperfect Teacher Models in Low-Budget Scenarios
Teaching-Assistant-in-the-Loop: Improving Knowledge Distillation from Imperfect Teacher Models in Low-Budget Scenarios
Yuhang Zhou
Wei Ai
43
5
0
08 Jun 2024
AgentGym: Evolving Large Language Model-based Agents across Diverse
  Environments
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Zhiheng Xi
Yiwen Ding
Wenxiang Chen
Boyang Hong
Honglin Guo
...
Qi Zhang
Xipeng Qiu
Xuanjing Huang
Zuxuan Wu
Yu-Gang Jiang
LLMAG
LM&Ro
38
29
0
06 Jun 2024
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models
Leonardo Ranaldi
André Freitas
LRM
ReLM
42
10
0
01 May 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRM
KELM
ReLM
33
31
0
26 Apr 2024
PARAMANU-GANITA: Can Small Math Language Models Rival with Large Language Models on Mathematical Reasoning?
PARAMANU-GANITA: Can Small Math Language Models Rival with Large Language Models on Mathematical Reasoning?
Mitodru Niyogi
Arnab Bhattacharya
LRM
ReLM
49
0
0
22 Apr 2024
A Theory for Length Generalization in Learning to Reason
A Theory for Length Generalization in Learning to Reason
Changnan Xiao
Bing Liu
LRM
47
9
0
31 Mar 2024
Scaling Data Diversity for Fine-Tuning Language Models in Human
  Alignment
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment
Feifan Song
Bowen Yu
Hao Lang
Haiyang Yu
Fei Huang
Houfeng Wang
Yongbin Li
ALM
43
11
0
17 Mar 2024
Trial and Error: Exploration-Based Trajectory Optimization for LLM
  Agents
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents
Yifan Song
Da Yin
Xiang Yue
Jie Huang
Sujian Li
Bill Yuchen Lin
45
67
0
04 Mar 2024
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
Arka Pal
Deep Karkhanis
Samuel Dooley
Manley Roberts
Siddartha Naidu
Colin White
OSLM
46
129
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELM
VLM
46
103
0
20 Feb 2024
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Yougang Lyu
Lingyong Yan
Shuaiqiang Wang
Haibo Shi
Dawei Yin
Pengjie Ren
Zhumin Chen
Maarten de Rijke
Zhaochun Ren
24
5
0
17 Feb 2024
ICDPO: Effectively Borrowing Alignment Capability of Others via
  In-context Direct Preference Optimization
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Feifan Song
Yuxuan Fan
Xin Zhang
Peiyi Wang
Houfeng Wang
32
8
0
14 Feb 2024
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
BBox-Adapter: Lightweight Adapting for Black-Box Large Language Models
Haotian Sun
Yuchen Zhuang
Wei Wei
Chao Zhang
Bo Dai
27
3
0
13 Feb 2024
V-STaR: Training Verifiers for Self-Taught Reasoners
V-STaR: Training Verifiers for Self-Taught Reasoners
Arian Hosseini
Xingdi Yuan
Nikolay Malkin
Rameswar Panda
Alessandro Sordoni
Rishabh Agarwal
ReLM
LRM
54
106
0
09 Feb 2024
CultureLLM: Incorporating Cultural Differences into Large Language
  Models
CultureLLM: Incorporating Cultural Differences into Large Language Models
Cheng-rong Li
Mengzhou Chen
Jindong Wang
Sunayana Sitaram
Xing Xie
VLM
51
18
0
09 Feb 2024
Training Large Language Models for Reasoning through Reverse Curriculum
  Reinforcement Learning
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi
Wenxiang Chen
Boyang Hong
Senjie Jin
Rui Zheng
...
Xinbo Zhang
Peng Sun
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
42
22
0
08 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open
  Language Models
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Y. K. Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
51
746
0
05 Feb 2024
Improving Large Language Models via Fine-grained Reinforcement Learning
  with Minimum Editing Constraint
Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint
Zhipeng Chen
Kun Zhou
Wayne Xin Zhao
Junchen Wan
Fuzheng Zhang
Di Zhang
Ji-Rong Wen
KELM
39
33
0
11 Jan 2024
Self-Contrast: Better Reflection Through Inconsistent Solving
  Perspectives
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives
Wenqi Zhang
Yongliang Shen
Linjuan Wu
Qiuying Peng
Jun Wang
Yueting Zhuang
Weiming Lu
LRM
LLMAG
45
51
0
04 Jan 2024
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human
  Annotations
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
Peiyi Wang
Lei Li
Zhihong Shao
R. X. Xu
Damai Dai
Yifei Li
Deli Chen
Y.Wu
Zhifang Sui
AIMat
LRM
ALM
53
279
0
14 Dec 2023
Conditions for Length Generalization in Learning Reasoning Skills
Conditions for Length Generalization in Learning Reasoning Skills
Changnan Xiao
Bing Liu
LRM
40
7
0
22 Nov 2023
Explanation-aware Soft Ensemble Empowers Large Language Model In-context
  Learning
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
Yue Yu
Jiaming Shen
Tianqi Liu
Zhen Qin
Jing Nathan Yan
Jialu Liu
Chao Zhang
Michael Bendersky
54
6
0
13 Nov 2023
Knowledgeable Preference Alignment for LLMs in Domain-specific Question
  Answering
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
Yichi Zhang
Zhuo Chen
Yin Fang
Yanxi Lu
Fangming Li
Wen Zhang
Hua-zeng Chen
66
30
0
11 Nov 2023
Amortizing intractable inference in large language models
Amortizing intractable inference in large language models
Marvin Schmitt
Moksh Jain
Daniel Habermann
Younesse Kaddar
Ullrich Kothe
Stefan T. Radev
Nikolay Malkin
AIFin
BDL
32
49
0
06 Oct 2023
Towards End-to-End Embodied Decision Making via Multi-modal Large
  Language Model: Explorations with GPT4-Vision and Beyond
Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond
Liang Chen
Yichi Zhang
Shuhuai Ren
Haozhe Zhao
Zefan Cai
Yuchi Wang
Peiyi Wang
Tianyu Liu
Baobao Chang
LM&Ro
LLMAG
33
41
0
03 Oct 2023
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language
  Models
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
L. Yu
Weisen Jiang
Han Shi
Jincheng Yu
Zhengying Liu
Yu Zhang
James T. Kwok
Zheng Li
Adrian Weller
Weiyang Liu
OSLM
LRM
50
341
0
21 Sep 2023
MMICL: Empowering Vision-language Model with Multi-Modal In-Context
  Learning
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao
Zefan Cai
Shuzheng Si
Xiaojian Ma
Kaikai An
Liang Chen
Zixuan Liu
Sheng Wang
Wenjuan Han
Baobao Chang
MLLM
VLM
28
134
0
14 Sep 2023
Statistical Rejection Sampling Improves Preference Optimization
Statistical Rejection Sampling Improves Preference Optimization
Tianqi Liu
Yao-Min Zhao
Rishabh Joshi
Misha Khalman
Mohammad Saleh
Peter J. Liu
Jialu Liu
61
215
0
13 Sep 2023
MAmmoTH: Building Math Generalist Models through Hybrid Instruction
  Tuning
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Xiang Yue
Xingwei Qu
Ge Zhang
Yao Fu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
AIMat
LRM
85
369
0
11 Sep 2023
No Train Still Gain. Unleash Mathematical Reasoning of Large Language
  Models with Monte Carlo Tree Search Guided by Energy Function
No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function
Haotian Xu
LRM
38
12
0
01 Sep 2023
When Do Program-of-Thoughts Work for Reasoning?
When Do Program-of-Thoughts Work for Reasoning?
Zhen Bi
Ningyu Zhang
Yinuo Jiang
Shumin Deng
Guozhou Zheng
Huajun Chen
LRM
38
20
0
29 Aug 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less
  Training Data and Smaller Model Sizes
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Lokesh Nagalapatti
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
222
506
0
03 May 2023
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
275
2,549
0
06 Oct 2022
12
Next