Low-Variance and Zero-Variance Baselines for Extensive-Form Games

22 July 2019

Michael Bowling

Papers citing "Low-Variance and Zero-Variance Baselines for Extensive-Form Games"

15 / 15 papers shown

A Survey on Self-play Methods in Reinforcement Learning Chao Yu Zelai Xu Chengdong Ma Chao Yu Weijuan Tu ... Deheng Ye Wenbo Ding Yaodong Yang Yu Wang Yu Wang SyDa SSL OnRL 77 8 0 02 Aug 2024
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments S. Srinivasan Marc Lanctot V. Zambaldi Julien Perolat K. Tuyls Rémi Munos Michael Bowling 30 148 0 21 Oct 2018
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines Martin Schmid Neil Burch Marc Lanctot Matej Moravcík Rudolf Kadlec Michael Bowling 105 64 0 09 Sep 2018
Depth-Limited Solving for Imperfect-Information Games Noam Brown Tuomas Sandholm Brandon Amos 47 80 0 21 May 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines Cathy Wu Aravind Rajeswaran Yan Duan Vikash Kumar Alexandre M. Bayen Sham Kakade Igor Mordatch Pieter Abbeel OffRL 37 151 0 20 Mar 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning George Tucker Surya Bhupatiraju S. Gu Richard Turner Zoubin Ghahramani Sergey Levine OffRL 49 127 0 27 Feb 2018
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments Ryan J. Lowe Yi Wu Aviv Tamar J. Harb Pieter Abbeel Igor Mordatch 113 4,441 0 07 Jun 2017
Counterfactual Multi-Agent Policy Gradients Jakob N. Foerster Gregory Farquhar Triantafyllos Afouras Nantas Nardelli Shimon Whiteson 49 2,053 0 24 May 2017
Safe and Nested Subgame Solving for Imperfect-Information Games Noam Brown Tuomas Sandholm 41 182 0 08 May 2017
DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker Matej Moravcík Martin Schmid Neil Burch Viliam Lisý Dustin Morrill Nolan Bard Trevor Davis Kevin Waugh Michael Bradley Johanson Michael Bowling BDL 61 904 0 06 Jan 2017
AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games Neil Burch Martin Schmid Matej Moravcík Michael Bowling 22 21 0 20 Dec 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning Philip S. Thomas Emma Brunskill OffRL 198 573 0 04 Apr 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning Nan Jiang Lihong Li OffRL 125 621 0 11 Nov 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation John Schulman Philipp Moritz Sergey Levine Michael I. Jordan Pieter Abbeel OffRL 38 3,368 0 08 Jun 2015
Bayes' Bluff: Opponent Modelling in Poker F. Southey Michael Bowling Bryce Larson Carmelo Piccione Neil Burch Darse Billings D. C. Rayner 90 260 0 04 Jul 2012