ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.17714
  4. Cited By
PPO-BR: Dual-Signal Entropy-Reward Adaptation for Trust Region Policy Optimization

PPO-BR: Dual-Signal Entropy-Reward Adaptation for Trust Region Policy Optimization

23 May 2025
Ben Rahman
ArXiv (abs)PDFHTML

Papers citing "PPO-BR: Dual-Signal Entropy-Reward Adaptation for Trust Region Policy Optimization"

9 / 9 papers shown
Title
Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications
Context-Aware Semantic Segmentation: Enhancing Pixel-Level Understanding with Large Language Models for Advanced Vision Applications
Ben Rahman
VLM
82
3
0
25 Mar 2025
Understanding the impact of entropy on policy optimization
Understanding the impact of entropy on policy optimization
Zafarali Ahmed
Nicolas Le Roux
Mohammad Norouzi
Dale Schuurmans
75
238
0
27 Nov 2018
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward
  Bias in Adversarial Imitation Learning
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
Ilya Kostrikov
Kumar Krishna Agrawal
Debidatta Dwibedi
Sergey Levine
Jonathan Tompson
108
259
0
09 Sep 2018
A Study on Overfitting in Deep Reinforcement Learning
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRLOnRL
59
391
0
18 Apr 2018
Noisy Networks for Exploration
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
97
897
0
30 Jun 2017
Constrained Policy Optimization
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
134
1,335
0
30 May 2017
Deep Reinforcement Learning: An Overview
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRLVLM
270
1,545
0
25 Jan 2017
High-Dimensional Continuous Control Using Generalized Advantage
  Estimation
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
135
3,442
0
08 Jun 2015
Trust Region Policy Optimization
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
283
6,807
0
19 Feb 2015
1