2505.17714
Cited By
PPO-BR: Dual-Signal Entropy-Reward Adaptation for Trust Region Policy Optimization
23 May 2025
Ben Rahman
Papers citing "PPO-BR: Dual-Signal Entropy-Reward Adaptation for Trust Region Policy Optimization"
9 / 9 papers shown
Title
Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications
Ben Rahman
VLM
82
3
0
25 Mar 2025
Understanding the impact of entropy on policy optimization
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
75
238
0
27 Nov 2018
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
Ilya Kostrikov
Kumar Krishna Agrawal
Debidatta Dwibedi
Sergey Levine
Jonathan Tompson
108
259
0
09 Sep 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL
OnRL
59
391
0
18 Apr 2018
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
97
897
0
30 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
134
1,335
0
30 May 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
270
1,545
0
25 Jan 2017
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
135
3,442
0
08 Jun 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
283
6,807
0
19 Feb 2015