2406.09563
Cited By
e-COP: Episodic Constrained Optimization of Policies
13 June 2024
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Sahil Singla
OffRL
Papers citing
"e-COP: Episodic Constrained Optimization of Policies"
8 / 8 papers shown
Title
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
129
2
0
13 Jun 2024
GPT-4 Technical Report
OpenAI
Josh Achiam
Steven Adler
Sandhini Agarwal
Lama Ahmad
...
Shengjia Zhao
Tianhao Zheng
Juntang Zhuang
William Zhuk
Barret Zoph
LLMAG
MLLM
890
13,788
0
15 Mar 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
49
2
0
02 Feb 2023
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Adam Stooke
Joshua Achiam
Pieter Abbeel
55
291
0
08 Jul 2020
IPO: Interior-point Policy Optimization under Constraints
Yongshuai Liu
J. Ding
Xin Liu
50
178
0
21 Oct 2019
Solving Rubik's Cube with a Robot Hand
OpenAI
Ilge Akkaya
Marcin Andrychowicz
Maciek Chociej
Mateusz Litwin
...
Peter Welinder
Lilian Weng
Qiming Yuan
Wojciech Zaremba
Lei Zhang
ODL
84
1,215
0
16 Oct 2019
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
285
18,685
0
20 Jul 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
106
1,313
0
30 May 2017