Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy
Actor-Critic

Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic

5 June 2023

Papers citing "Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic"

10 / 10 papers shown

Title
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control Michal Nauman M. Ostaszewski Krzysztof Jankowski Piotr Milo's Marek Cygan OffRL 47 16 0 25 May 2024
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning Michal Nauman Michal Bortkiewicz Piotr Milo's Tomasz Trzciñski M. Ostaszewski Marek Cygan OffRL 35 17 0 01 Mar 2024
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization Tianying Ji Yongyuan Liang Yan Zeng Yu-Juan Luo Guowei Xu Jiawei Guo Ruijie Zheng Furong Huang Gang Hua Huazhe Xu CML 50 11 0 22 Feb 2024
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps Haoyi Niu Tianying Ji Bingqi Liu Haocheng Zhao Xiangyu Zhu Jianying Zheng Pengfei Huang Guyue Zhou Jianming Hu Xianyuan Zhan OffRL OnRL AI4CE 29 7 0 22 Sep 2023
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective Raj Ghugare Homanga Bharadhwaj Benjamin Eysenbach Sergey Levine Ruslan Salakhutdinov OffRL 48 25 0 18 Sep 2022
Optimistic Curiosity Exploration and Conservative Exploitation with Linear Reward Shaping Hao Sun Lei Han Rui Yang Xiaoteng Ma Jian Guo Bolei Zhou OffRL OnRL 38 10 0 15 Sep 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 369 12,081 0 04 Mar 2022
Offline Reinforcement Learning with Implicit Q-Learning Ilya Kostrikov Ashvin Nair Sergey Levine OffRL 214 848 0 12 Oct 2021
Learning Pessimism for Robust and Efficient Off-Policy Reinforcement Learning Edoardo Cetin Oya Celiktutan OffRL 42 17 0 07 Oct 2021
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics Arsenii Kuznetsov Pavel Shvechikov Alexander Grishin Dmitry Vetrov 136 187 0 08 May 2020