2412.16878
Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model
22 December 2024
Songjun Tu, Jingbo Sun, Qichao Zhang, Xiangyuan Lan, Dongbin Zhao
Papers citing "Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model"
ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving
Xueyi Liu, Zuodong Zhong, Yuxin Guo, Yun-Fu Liu, Zhiguo Su, ..., Yinfeng Gao, Yupeng Zheng, Qiao Lin, Huiyong Chen, Dongbin Zhao
LRM
7
0
0
26 May 2025
AED: Automatic Discovery of Effective and Diverse Vulnerabilities for Autonomous Driving Policy with Large Language Models
Le Qiu, Zelai Xu, Qixin Tan, Wenhao Tang, Chao Yu, Yu Wang
AAML
71
0
0
24 Mar 2025
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri, Rahul Jain, Deepak Ramachandran, Zheng Wen
OffRL
63
2
0
13 Jun 2024