Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking

6 June 2024

Matthias Althoff

Papers citing "Excluding the Irrelevant: Focusing Reinforcement Learning through Continuous Action Masking"


Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking
Hanna Krasowski, Jakob Thumm, Marlon Müller, Lukas Schäfer, Xiao Wang, Matthias Althoff
88 19 0
13 May 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM ALM
339 12,003 0
04 Mar 2022