arXiv:1712.01815
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
5 December 2017
David Silver
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
Matthew Lai
A. Guez
Marc Lanctot
Laurent Sifre
D. Kumaran
T. Graepel
Timothy Lillicrap
Karen Simonyan
Demis Hassabis
Papers citing
"Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm"
50 / 266 papers shown
Title
Scaling Artificial Intelligence for Digital Wargaming in Support of Decision-Making
Scotty Black
Christian J. Darken
19
2
0
08 Feb 2024
The RL/LLM Taxonomy Tree: Reviewing Synergies Between Reinforcement Learning and Large Language Models
M. Pternea
Prerna Singh
Abir Chakraborty
Y. Oruganti
M. Milletarí
Sayli Bapat
Kebei Jiang
OffRL
36
7
0
02 Feb 2024
Layered and Staged Monte Carlo Tree Search for SMT Strategy Synthesis
Zhengyang Lu
Stefan Siemer
Piyush Jha
Joel D. Day
Florin Manea
Vijay Ganesh
14
1
0
30 Jan 2024
Generalized Nested Rollout Policy Adaptation with Limited Repetitions
Tristan Cazenave
8
3
0
18 Jan 2024
From Images to Connections: Can DQN with GNNs learn the Strategic Game of Hex?
Yannik Keller
Jannis Blüml
Gopika Sudhakaran
Kristian Kersting
GNN
37
0
0
22 Nov 2023
Runtime Verification of Learning Properties for Reinforcement Learning Algorithms
T. Mannucci
Julio de Oliveira Filho
OffRL
8
0
0
16 Nov 2023
Optimal Robotic Assembly Sequence Planning: A Sequential Decision-Making Approach
Kartik Nagpal
Negar Mehr
35
0
0
26 Oct 2023
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games
Yang Li
Kun Xiong
Yingping Zhang
Jiangcheng Zhu
Stephen Marcus McAleer
Wei Pan
Jun Wang
Zonghong Dai
Yaodong Yang
41
2
0
09 Aug 2023
Towards General Game Representations: Decomposing Games Pixels into Content and Style
C. Trivedi
Konstantinos Makantasis
Antonios Liapis
Georgios N. Yannakakis
OCL
43
3
0
20 Jul 2023
Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing
Qiulei Wang
Lei Yan
Gang Hu
Wenli Chen
Jean Rabault
B. R. Noack
AI4CE
23
24
0
05 Jul 2023
Model-Based Simulation for Optimising Smart Reply
Benjamin Towle
Ke Zhou
32
1
0
26 May 2023
Discovering Individual Rewards in Collective Behavior through Inverse Multi-Agent Reinforcement Learning
Daniel Waelchli
Pascal Weber
Petros Koumoutsakos
AI4CE
27
4
0
17 May 2023
Adaptive Feature Fusion: Enhancing Generalization in Deep Learning Models
Neelesh Mungoli
28
23
0
04 Apr 2023
Online augmentation of learned grasp sequence policies for more adaptable and data-efficient in-hand manipulation
E. Gordon
Rana Soltani-Zarrin
OffRL
29
5
0
04 Apr 2023
Managing power grids through topology actions: A comparative study between advanced rule-based and reinforcement learning agents
Malte Lehna
J. Viebahn
Christoph Scholz
Antoine Marot
Sven Tomforde
32
19
0
03 Apr 2023
Meta-Learning Parameterized First-Order Optimizers using Differentiable Convex Optimization
Tanmay Gautam
Samuel Pfrommer
Somayeh Sojoudi
26
2
0
29 Mar 2023
A New Policy Iteration Algorithm For Reinforcement Learning in Zero-Sum Markov Games
Anna Winnicki
R. Srikant
37
1
0
17 Mar 2023
A Reinforcement Learning Approach for Scheduling Problems With Improved Generalization Through Order Swapping
Deepak Vivekanandan
Samuel Wirth
Patrick Karlbauer
Noah Klarmann
31
6
0
27 Feb 2023
TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play
Fanqing Lin
Shiyu Huang
Tim Pearce
Wenze Chen
Weijuan Tu
26
17
0
15 Feb 2023
Energy Efficiency of Training Neural Network Architectures: An Empirical Study
Yi Xu
Silverio Martínez-Fernández
Matias Martinez
Xavier Franch
28
13
0
02 Feb 2023
Visual Imitation Learning with Patch Rewards
Minghuan Liu
Tairan He
Weinan Zhang
Shuicheng Yan
Zhongwen Xu
SSL
22
13
0
02 Feb 2023
Policy-Value Alignment and Robustness in Search-based Multi-Agent Learning
Niko A. Grupen
M. Hanlon
Alexis Hao
Daniel D. Lee
B. Selman
27
0
0
27 Jan 2023
PushWorld: A benchmark for manipulation planning with tools and movable obstacles
Ken Kansky
Skanda Vaidyanath
Scott Swingle
Xinghua Lou
Miguel Lazaro-Gredilla
Dileep George
31
4
0
24 Jan 2023
Switchable Lightweight Anti-symmetric Processing (SLAP) with CNN Outspeeds Data Augmentation by Smaller Sample -- Application in Gomoku Reinforcement Learning
Chi-Hang Suen
Eduardo Alonso
23
0
0
11 Jan 2023
Character Simulation Using Imitation Learning With Game Engine Physics
Joao Rodrigues
R. Nóbrega
AI4CE
19
2
0
05 Jan 2023
Generalised agent for solving higher board states of tic tac toe using Reinforcement Learning
Bhavuk Kalra
16
0
0
23 Dec 2022
Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning
Christopher T. Lengerich
Gabriel Synnaeve
Amy Zhang
Hugh Leather
Kurt Shuster
François Charton
Charysse Redwood
SSL
OffRL
32
1
0
21 Dec 2022
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
121
1,495
0
15 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
23
11
0
08 Dec 2022
ISAACS: Iterative Soft Adversarial Actor-Critic for Safety
Kai Hsu
D. Nguyen
J. F. Fisac
25
30
0
06 Dec 2022
Actively Learning Costly Reward Functions for Reinforcement Learning
André Eberhard
Houssam Metni
G. Fahland
A. Stroh
Pascal Friederich
OffRL
41
0
0
23 Nov 2022
Credit-cognisant reinforcement learning for multi-agent cooperation
F. Bredell
H. A. Engelbrecht
J. C. Schoeman
13
0
0
18 Nov 2022
Deep Instance Segmentation and Visual Servoing to Play Jenga with a Cost-Effective Robotic System
Luca Marchionna
G. Pugliese
Mauro Martini
Simone Angarano
Francesco Salvetti
Marcello Chiaberge
49
3
0
15 Nov 2022
Progress and summary of reinforcement learning on energy management of MPS-EV
Jincheng Hu
Yang Lin
Liang Chu
Zhuoran Hou
Jihan Li
Jingjing Jiang
Yuanjian Zhang
25
12
0
08 Nov 2022
Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
Siddharth Nayak
Kenneth M. F. Choi
Wenqi Ding
Sydney I. Dolan
Karthik Gopalakrishnan
H. Balakrishnan
17
29
0
03 Nov 2022
Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration
Mesut Yang
Micah Carroll
Anca Dragan
39
13
0
03 Nov 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
30
74
0
26 Oct 2022
Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos
A. Ho
J. Sevilla
T. Besiroglu
Lennart Heim
Marius Hobbhahn
ALM
49
114
0
26 Oct 2022
CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations
Kai Yan
Alex Schwing
Yu-xiong Wang
OffRL
30
2
0
18 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
74
203
0
14 Oct 2022
Visual Reinforcement Learning with Self-Supervised 3D Representations
Yanjie Ze
Nicklas Hansen
Yinbo Chen
Mohit Jain
Xiaolong Wang
SSL
34
49
0
13 Oct 2022
Efficient circuit implementation for coined quantum walks on binary trees and application to reinforcement learning
Thomas Mullor
David Vigouroux
Louis Bethune
21
0
0
13 Oct 2022
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
Félix Chalumeau
Raphael Boige
Bryan Lim
Valentin Macé
Maxime Allard
Arthur Flajolet
Antoine Cully
Thomas Pierrot
31
21
0
06 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
32
26
0
29 Sep 2022
Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter
Ruben Villarreal
Nikolaos N. Vlassis
Nhon N. Phan
Tommie A. Catanach
Reese E. Jones
N. Trask
S. Kramer
WaiChing Sun
OffRL
32
11
0
27 Sep 2022
Graph Value Iteration
Dieqiao Feng
Carla P. Gomes
B. Selman
11
0
0
20 Sep 2022
Parallel Monte Carlo Tree Search with Batched Rigid-body Simulations for Speeding up Long-Horizon Episodic Robot Planning
Baichuan Huang
Abdeslam Boularias
Jingjin Yu
20
9
0
14 Jul 2022
Stabilizing Off-Policy Deep Reinforcement Learning from Pixels
Edoardo Cetin
Philip J. Ball
Steve Roberts
Oya Celiktutan
40
36
0
03 Jul 2022
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine
Jiayi Weng
Min Lin
Shengyi Huang
Bo Liu
Denys Makoviichuk
...
Yufan Song
Ting Luo
Yukun Jiang
Zhongwen Xu
Shuicheng Yan
MoE
19
61
0
21 Jun 2022
A Survey on Model-based Reinforcement Learning
Fan Luo
Tian Xu
Hang Lai
Xiong-Hui Chen
Weinan Zhang
Yang Yu
OffRL
LRM
53
101
0
19 Jun 2022