arXiv:2504.05108 (v3, latest)
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning
7 April 2025
Anja Surina
Amin Mansouri
Lars Quaedvlieg
Amal Seddas
Maryna Viazovska
Emmanuel Abbe
Çağlar Gülçehre
Papers citing "Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning"
43 / 43 papers shown
AlphaEvolve: A coding agent for scientific and algorithmic discovery
Alexander Novikov
Ngan Vu
Marvin Eisenberger
Emilien Dupont
Po-Sen Huang
...
George Holland
Alex Davies
Sebastian Nowozin
Pushmeet Kohli
Matej Balog
62
17
0
16 Jun 2025
LLM-Meta-SR: Learning to Evolve Selection Operators for Symbolic Regression
Hengzhe Zhang
Qi Chen
Bing Xue
Mengjie Zhang
66
0
0
24 May 2025
NeoBERT: A Next-Generation BERT
Lola Le Breton
Quentin Fournier
Mariam El Mezouar
John X. Morris
Sarath Chandar
AI4TS
140
1
0
26 Feb 2025
Amplifying human performance in combinatorial competitive programming
Petar Veličković
Alex Vitvitskyi
Larisa Markeeva
Borja Ibarz
Lars Buesing
Matej Balog
Alexander Novikov
87
4
0
29 Nov 2024
Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Xiaoxuan Lu
Samuel Holt
Claudio Fanconi
Alex J. Chan
Jakob Foerster
M. Schaar
R. T. Lange
OffRL
114
18
0
12 Jun 2024
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang
Weizhe Yuan
Kyunghyun Cho
He He
Sainbayar Sukhbaatar
Jason Weston
LRM
161
138
0
30 Apr 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Marah Abdin
Sam Ade Jacobs
A. A. Awan
J. Aneja
Ahmed Hassan Awadallah
...
Li Zhang
Yi Zhang
Yue Zhang
Yunan Zhang
Xiren Zhou
LRM
ALM
209
1,275
0
22 Apr 2024
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello
Daniel Guo
Rémi Munos
Mark Rowland
Yunhao Tang
...
Michal Valko
Tianqi Liu
Rishabh Joshi
Zeyu Zheng
Bilal Piot
110
67
0
13 Mar 2024
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
Fei Liu
Xialiang Tong
Mingxuan Yuan
Xi Lin
Fu Luo
Zhenkun Wang
Zhichao Lu
Qingfu Zhang
VLM
131
84
0
04 Jan 2024
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D. Co-Reyes
Rishabh Agarwal
Ankesh Anand
Piyush Patil
...
Yamini Bansal
Ethan Dyer
Behnam Neyshabur
Jascha Narain Sohl-Dickstein
Noah Fiedel
ALM
LRM
ReLM
SyDa
278
190
0
11 Dec 2023
Algorithm Evolution Using Large Language Model
Fei Liu
Xialiang Tong
Mingxuan Yuan
Qingfu Zhang
83
46
0
26 Nov 2023
Large Language Models as Evolutionary Optimizers
Shengcai Liu
Caishun Chen
Xinghua Qu
Jiaheng Zhang
Yew-Soon Ong
102
108
0
29 Oct 2023
Eureka: Human-Level Reward Design via Coding Large Language Models
Yecheng Jason Ma
William Liang
Guanzhi Wang
De-An Huang
Osbert Bastani
Dinesh Jayaraman
Yuke Zhu
Linxi Fan
A. Anandkumar
85
325
0
19 Oct 2023
Neural Combinatorial Optimization with Heavy Decoder: Toward Large Scale Generalization
Fu Luo
Xi Lin
Fei Liu
Qingfu Zhang
Zhenkun Wang
110
79
0
12 Oct 2023
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk
Ishita Mediratta
Christoforos Nalmpantis
Jelena Luketina
Eric Hambro
Edward Grefenstette
Roberta Raileanu
AI4CE
ALM
212
150
0
10 Oct 2023
Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution
Chrisantha Fernando
Dylan Banarse
Henryk Michalewski
Simon Osindero
Tim Rocktäschel
LLMAG
ReLM
LRM
105
211
0
28 Sep 2023
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
Chaoqi Wang
Yibo Jiang
Yuguang Yang
Han Liu
Yuxin Chen
90
108
0
28 Sep 2023
Large Language Models as Optimizers
Chengrun Yang
Xuezhi Wang
Yifeng Lu
Hanxiao Liu
Quoc V. Le
Denny Zhou
Xinyun Chen
ODL
154
435
0
07 Sep 2023
Reinforced Self-Training (ReST) for Language Modeling
Çağlar Gülçehre
T. Paine
S. Srinivasan
Ksenia Konyushkova
L. Weerts
...
Chenjie Gu
Wolfgang Macherey
Arnaud Doucet
Orhan Firat
Nando de Freitas
OffRL
129
309
0
17 Aug 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
158
534
0
27 Jul 2023
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX
Clément Bonnet
Daniel Luo
Donal Byrne
Shikha Surana
Sasha Abramowitz
...
Siddarth S. Singh
Daniel Furelos-Blanco
Victor Le
Arnu Pretorius
Alexandre Laterre
119
31
0
16 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
405
4,189
0
29 May 2023
Evolution through Large Models
Joel Lehman
Jonathan Gordon
Shawn Jain
Kamal Ndousse
Cathy Yeh
Kenneth O. Stanley
100
94
0
17 Jun 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
982
13,285
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
1.1K
9,823
0
28 Jan 2022
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
780
10,647
0
17 Jun 2021
Efficient Active Search for Combinatorial Optimization Problems
André Hottung
Yeong-Dae Kwon
Kevin Tierney
91
97
0
09 Jun 2021
Deep Policy Dynamic Programming for Vehicle Routing Problems
W. Kool
H. V. Hoof
J. Gromicho
Max Welling
115
125
0
23 Feb 2021
Generalize a Small Pre-trained Model to Arbitrarily Large TSP Instances
Zhang-Hua Fu
K. Qiu
H. Zha
TPM
99
184
0
19 Dec 2020
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
304
2,195
0
02 Sep 2020
Learning the Travelling Salesperson Problem Requires Rethinking Generalization
Chaitanya K. Joshi
Quentin Cappart
Louis-Martin Rousseau
T. Laurent
264
122
0
12 Jun 2020
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
582
2,051
0
04 May 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
181
1,840
0
13 Dec 2019
Neural Large Neighborhood Search for the Capacitated Vehicle Routing Problem
André Hottung
Kevin Tierney
106
149
0
21 Nov 2019
An Efficient Graph Convolutional Network Technique for the Travelling Salesman Problem
Chaitanya K. Joshi
T. Laurent
Xavier Bresson
GNN
124
378
0
04 Jun 2019
The Curious Case of Neural Text Degeneration
Ari Holtzman
Jan Buys
Li Du
Maxwell Forbes
Yejin Choi
215
3,216
0
22 Apr 2019
Learning to Perform Local Rewriting for Combinatorial Optimization
Xinyun Chen
Yuandong Tian
NAI
OffRL
159
352
0
30 Sep 2018
Attention, Learn to Solve Routing Problems!
W. Kool
H. V. Hoof
Max Welling
141
1,237
0
22 Mar 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
698
19,363
0
20 Jul 2017
Neural Combinatorial Optimization with Reinforcement Learning
Irwan Bello
Hieu H. Pham
Quoc V. Le
Mohammad Norouzi
Samy Bengio
175
1,501
0
29 Nov 2016
An overview of gradient descent optimization algorithms
Sebastian Ruder
ODL
256
6,221
0
15 Sep 2016
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
251
3,809
0
18 Nov 2015
Pointer Networks
Oriol Vinyals
Meire Fortunato
Navdeep Jaitly
240
3,071
0
09 Jun 2015