ResearchTrend.AI

Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

14 May 2019
Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, C. Browne
arXiv: 1905.05809

Papers citing "Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates"

4 / 4 papers shown
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games
Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen Marcus McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang
44
2
0
09 Aug 2023
Monte Carlo Tree Search: A Review of Recent Modifications and Applications
M. Świechowski, Konrad Godlewski, B. Sawicki, Jacek Mańdziuk
46
252
0
08 Mar 2021
Foundations of Digital Archæoludology
C. Browne, Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, Michael Conrad, ..., Abdallah Saffidine, Ulrich Schädler, Jorge Nuno Silva, A. Voogt, M. Winands
AI4CE
22
8
0
31 May 2019
Off-Policy Actor-Critic
T. Degris, Martha White, R. Sutton
OffRL
CML
163
220
0
22 May 2012