ResearchTrend.AI
arXiv: 1907.05079
Safe Policy Improvement with Soft Baseline Bootstrapping

11 July 2019
Kimia Nadjahi
Romain Laroche
Rémi Tachet des Combes
    OffRL

Papers citing "Safe Policy Improvement with Soft Baseline Bootstrapping"

12 / 12 papers shown
Safe Policy Improvement for POMDPs via Finite-State Controllers
T. D. Simão
Marnix Suilen
N. Jansen
OffRL
40
9
0
12 Jan 2023
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning
David Brandfonbrener
Rémi Tachet des Combes
Romain Laroche
OffRL
46
5
0
02 Jun 2022
Non-Markovian policies occupancy measures
Romain Laroche
Rémi Tachet des Combes
Jacob Buckman
OffRL
44
1
0
27 May 2022
User-Interactive Offline Reinforcement Learning
Phillip Swazinna
Steffen Udluft
Thomas Runkler
OffRL
41
11
0
21 May 2022
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar
Joey Hong
Anika Singh
Sergey Levine
OffRL
50
79
0
12 Apr 2022
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms
Romain Laroche
Rémi Tachet des Combes
53
2
0
15 Feb 2022
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates
Romain Laroche
Rémi Tachet des Combes
51
8
0
29 Sep 2021
Offline RL Without Off-Policy Evaluation
David Brandfonbrener
William F. Whitney
Rajesh Ranganath
Joan Bruna
OffRL
47
163
0
16 Jun 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
44
281
0
22 Mar 2021
Offline Reinforcement Learning with Pseudometric Learning
Robert Dadashi
Shideh Rezaeifar
Nino Vieillard
Léonard Hussenot
Olivier Pietquin
Matthieu Geist
OffRL
39
40
0
02 Mar 2021
The Importance of Pessimism in Fixed-Dataset Policy Optimization
Jacob Buckman
Carles Gelada
Marc G. Bellemare
OffRL
47
137
0
15 Sep 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
38
75
0
16 Jun 2020