ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.04933
  4. Cited By
Reinforcement Learning Tutor Better Supported Lower Performers in a Math
  Task

Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task

11 April 2023
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
Meng Guo
Yao Liu
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
ArXivPDFHTML

Papers citing "Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task"

5 / 5 papers shown
Title
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs
  Hierarchically
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong
Arvind V. Mahankali
Tengyu Ma
ReLM
LRM
30
6
0
04 Nov 2024
Off-Policy Selection for Initiating Human-Centric Experimental Design
Off-Policy Selection for Initiating Human-Centric Experimental Design
Ge Gao
Xi Yang
Qitong Gao
Song Ju
Miroslav Pajic
Min Chi
OffRL
36
0
0
26 Oct 2024
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates
  of Multiple Estimators
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators
Allen Nie
Yash Chandak
Christina J. Yuan
Anirudhan Badrinath
Yannis Flet-Berliac
Emma Brunskil
OffRL
50
0
0
27 May 2024
Is Offline Decision Making Possible with Only Few Samples? Reliable
  Decisions in Data-Starved Bandits via Trust Region Enhancement
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement
Ruiqi Zhang
Yuexiang Zhai
Andrea Zanette
48
0
0
24 Feb 2024
Off-Policy Evaluation for Human Feedback
Off-Policy Evaluation for Human Feedback
Qitong Gao
Ge Gao
Juncheng Dong
Vahid Tarokh
Min Chi
Miroslav Pajic
OffRL
29
5
0
11 Oct 2023
1