2501.05407
On-line Policy Improvement using Monte-Carlo Search
Neural Information Processing Systems (NeurIPS), 1996
9 January 2025
Gerald Tesauro
Gregory R. Galperin
Papers citing "On-line Policy Improvement using Monte-Carlo Search"
50 / 53 papers shown
Title
Adaptive Network Security Policies via Belief Aggregation and Rollout
Kim Hammar
Yuchao Li
Tansu Alpcan
Emil C. Lupu
Dimitri P. Bertsekas
161
4
0
21 Jul 2025
A Survey on Self-play Methods in Reinforcement Learning
Chao Yu
Zelai Xu
Chengdong Ma
Chao Yu
Weijuan Tu
...
Deheng Ye
Wenbo Ding
Yu Wang
SyDa
SSL
OnRL
526
21
0
02 Aug 2024
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming
Dimitri Bertsekas
265
13
0
02 Jun 2024
An Approximate Dynamic Programming Framework for Occlusion-Robust Multi-Object Tracking
Pratyusha Musunuru
Yuchao Li
Jamison Weber
Dimitri P. Bertsekas
199
0
0
24 May 2024
Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
AI4CE
246
15
0
09 Apr 2024
Tree Search in DAG Space with Model-based Reinforcement Learning for Causal Discovery
Victor-Alexandru Darvariu
Stephen Hailes
Mirco Musolesi
CML
234
3
0
20 Oct 2023
Iterative Option Discovery for Planning, by Planning
Kenny Young
Richard S. Sutton
299
2
0
02 Oct 2023
Magnetic Field-Based Reward Shaping for Goal-Conditioned Reinforcement Learning
IEEE/CAA Journal of Automatica Sinica (IEEE/CAA JAS), 2023
Hongyu Ding
Yuan-Yan Tang
Qing Wu
Bo Wang
Chunlin Chen
Zhi Wang
285
7
0
16 Jul 2023
The Update-Equivalence Framework for Decision-Time Planning
International Conference on Learning Representations (ICLR), 2023
Samuel Sokota
Gabriele Farina
David J. Wu
Hengyuan Hu
Kevin A. Wang
J. Zico Kolter
Noam Brown
243
5
0
25 Apr 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games
Anna Winnicki
R. Srikant
302
2
0
17 Mar 2023
Multiagent Rollout with Reshuffling for Warehouse Robots Path Planning
IFAC-PapersOnLine (IFAC-PapersOnLine), 2022
William Emanuelsson
Alejandro Penacho Riveiros
Yuchao Li
Karl H. Johansson
Jonas Mårtensson
191
2
0
15 Nov 2022
Nested Search versus Limited Discrepancy Search
Tristan Cazenave
165
0
0
01 Oct 2022
Regret Analysis for Hierarchical Experts Bandit Problem
Qihan Guo
Siwei Wang
Jun Zhu
214
1
0
11 Aug 2022
A Survey on Model-based Reinforcement Learning
Science China Information Sciences (Sci. China Inf. Sci.), 2022
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
264
142
0
19 Jun 2022
Learning from Drivers to Tackle the Amazon Last Mile Routing Research Challenge
Chen Wu
Yin Song
Verdi March
Eden Duthie
316
9
0
09 May 2022
Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation
IEEE International Conference on Robotics and Automation (ICRA), 2022
Maximilian Igl
Daewoo Kim
Alex Kuefler
Paul Mougin
Punit Shah
K. Shiarlis
Drago Anguelov
Mark Palatucci
Brandyn White
Shimon Whiteson
197
75
0
06 May 2022
A Dynamic Programming Algorithm for Finding an Optimal Sequence of Informative Measurements
P. Loxley
Ka Wai Cheung
339
4
0
24 Sep 2021
Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control
Dimitri Bertsekas
AI4CE
224
60
0
20 Aug 2021
Model-Based Opponent Modeling
Xiaopeng Yu
Jiechuan Jiang
Wanpeng Zhang
Haobin Jiang
Zongqing Lu
OffRL
230
37
0
04 Aug 2021
Train on Small, Play the Large: Scaling Up Board Games with AlphaZero and GNN
Shai Ben-Assayag
Ran El-Yaniv
GNN
168
9
0
18 Jul 2021
Leveraging Tripartite Interaction Information from Live Stream E-Commerce for Improving Product Recommendation
Knowledge Discovery and Data Mining (KDD), 2021
Sanshi Lei Yu
Zhuoxuan Jiang
Dongdong Chen
Shanshan Feng
Dongsheng Li
Qi Liu
Jinfeng Yi
148
29
0
07 Jun 2021
Annotating Motion Primitives for Simplifying Action Search in Reinforcement Learning
IEEE Transactions on Emerging Topics in Computational Intelligence (IEEE TETCI), 2021
I. Sledge
Darshan W. Bryner
José C. Príncipe
294
1
0
24 Feb 2021
Monte Carlo Rollout Policy for Recommendation Systems with Dynamic User Behavior
International Conference on Communication Systems and Networks (COMSNETS), 2021
R. Meshram
Kesav Kaza
OffRL
205
3
0
08 Feb 2021
Deep Controlled Learning for Inventory Control
European Journal of Operational Research (EJOR), 2020
Tarkan Temizoz
Christina Imdahl
R. Dijkman
Douniel Lamghari-Idrissi
W. Jaarsveld
390
16
0
30 Nov 2020
On the role of planning in model-based deep reinforcement learning
Jessica B. Hamrick
A. Friesen
Feryal M. P. Behbahani
A. Guez
Fabio Viola
Sims Witherspoon
Thomas W. Anthony
Lars Buesing
Petar Velickovic
T. Weber
OffRL
305
71
0
08 Nov 2020
Lifelong Incremental Reinforcement Learning with Online Bayesian Inference
IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020
Zhi Wang
Chunlin Chen
D. Dong
CLL
OffRL
200
61
0
28 Jul 2020
Simulation Based Algorithms for Markov Decision Processes and Multi-Action Restless Bandits
R. Meshram
Kesav Kaza
246
10
0
25 Jul 2020
Model-based Reinforcement Learning: A Survey
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
392
63
0
30 Jun 2020
A Unifying Framework for Reinforcement Learning and Planning
Thomas M. Moerland
Joost Broekens
Aske Plaat
Catholijn M. Jonker
OffRL
373
10
0
26 Jun 2020
Continuous Control for Searching and Planning with a Learned Model
Xuxi Yang
Werner Duvaud
Peng Wei
186
5
0
12 Jun 2020
Review, Analysis and Design of a Comprehensive Deep Reinforcement Learning Framework
Ngoc Duy Nguyen
Thanh Thi Nguyen
Hai V. Nguyen
Doug Creighton
S. Nahavandi
283
3
0
27 Feb 2020
Constrained Multiagent Rollout and Multidimensional Assignment with the Auction Algorithm
Dimitri Bertsekas
197
12
0
18 Feb 2020
Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems
IEEE Robotics and Automation Letters (RA-L), 2020
Sushmita Bhattacharya
Sahil Badyal
Thomas Wheeler
Stephanie Gil
Dimitri Bertsekas
152
37
0
11 Feb 2020
The Choice Function Framework for Online Policy Improvement
AAAI Conference on Artificial Intelligence (AAAI), 2019
Murugeswari Issakkimuthu
Alan Fern
Prasad Tadepalli
OffRL
145
1
0
01 Oct 2019
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees
Thomas W. Anthony
Robert Nishihara
Philipp Moritz
Tim Salimans
John Schulman
163
30
0
07 Apr 2019
Learn a Prior for RHEA for Better Online Planning
Xinyao Tong
W. Liu
Bin Li
OffRL
220
0
0
14 Feb 2019
Learning 6-DoF Grasping and Pick-Place Using Attention Focus
Marcus Gualtieri
Robert Platt
224
61
0
15 Jun 2018
Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning
Yonathan Efroni
Gal Dalal
B. Scherrer
Shie Mannor
OffRL
227
14
0
21 May 2018
Feature-Based Aggregation and Deep Reinforcement Learning: A Survey and Some New Implementations
Dimitri Bertsekas
OffRL
226
136
0
12 Apr 2018
Beyond the One Step Greedy Approach in Reinforcement Learning
Yonathan Efroni
Gal Dalal
B. Scherrer
Shie Mannor
OffRL
241
53
0
10 Feb 2018
Learning the Reward Function for a Misspecified Model
Erik Talvitie
296
11
0
29 Jan 2018
A Survey on Compiler Autotuning using Machine Learning
Amir H. Ashouri
W. Killian
John Cavazos
G. Palermo
Cristina Silvano
361
227
0
13 Jan 2018
Imagination-Augmented Agents for Deep Reinforcement Learning
T. Weber
S. Racanière
David P. Reichert
Lars Buesing
A. Guez
...
Razvan Pascanu
Peter W. Battaglia
Demis Hassabis
David Silver
Daan Wierstra
LM&Ro
202
582
0
19 Jul 2017
Multi-Labelled Value Networks for Computer Go
IEEE Transactions on Games (TG), 2017
Tai-Lin Wu
I-Chen Wu
Guan-Wun Chen
Ting Han Wei
Tung-Yi Lai
Hung-Chun Wu
Li-Cheng Lan
159
24
0
30 May 2017
Self-Correcting Models for Model-Based Reinforcement Learning
AAAI Conference on Artificial Intelligence (AAAI), 2016
Erik Talvitie
LRM
255
97
0
19 Dec 2016
Approximate Policy Iteration for Budgeted Semantic Video Segmentation
Behrooz Mahasseni
S. Todorovic
Alan Fern
132
4
0
26 Jul 2016
Using Monte Carlo Search With Data Aggregation to Improve Robot Soccer Policies
Robot Soccer World Cup (RoboCup), 2016
Francesco Riccio
Roberto Capobianco
Daniele Nardi
148
4
0
01 Jun 2016
Classification-based Approximate Policy Iteration: Experiments and Extended Discussions
Amir-massoud Farahmand
Doina Precup
André Barreto
Mohammad Ghavamzadeh
OffRL
158
7
0
02 Jul 2014
Analysis of Watson's Strategies for Playing Jeopardy!
Journal of Artificial Intelligence Research (JAIR), 2013
Gerald Tesauro
David Gondek
J. Lenchner
James Fan
J. Prager
178
34
0
04 Feb 2014
Learning to Win by Reading Manuals in a Monte-Carlo Framework
Annual Meeting of the Association for Computational Linguistics (ACL), 2011
S. Branavan
David Silver
Regina Barzilay
159
193
0
18 Jan 2014