ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.09563
  4. Cited By
e-COP : Episodic Constrained Optimization of Policies

e-COP : Episodic Constrained Optimization of Policies

13 June 2024
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Sahil Singla
    OffRL
ArXivPDFHTML

Papers citing "e-COP : Episodic Constrained Optimization of Policies"

8 / 8 papers shown
Title
Online Bandit Learning with Offline Preference Data for Improved RLHF
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
129
2
0
13 Jun 2024
GPT-4 Technical Report
GPT-4 Technical Report
OpenAI OpenAI
OpenAI Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
890
13,788
0
15 Mar 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
49
2
0
02 Feb 2023
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Adam Stooke
Joshua Achiam
Pieter Abbeel
55
291
0
08 Jul 2020
IPO: Interior-point Policy Optimization under Constraints
IPO: Interior-point Policy Optimization under Constraints
Yongshuai Liu
J. Ding
Xin Liu
50
178
0
21 Oct 2019
Solving Rubik's Cube with a Robot Hand
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Ma-teusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
84
1,215
0
16 Oct 2019
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
285
18,685
0
20 Jul 2017
Constrained Policy Optimization
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
106
1,313
0
30 May 2017
1