ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.01624
  4. Cited By
Off-Policy Evaluation via Off-Policy Classification

Off-Policy Evaluation via Off-Policy Classification

4 June 2019
A. Irpan
Kanishka Rao
Konstantinos Bousmalis
Chris Harris
Julian Ibarz
Sergey Levine
    OffRL
ArXivPDFHTML

Papers citing "Off-Policy Evaluation via Off-Policy Classification"

17 / 17 papers shown
Title
Few Dimensions are Enough: Fine-tuning BERT with Selected Dimensions Revealed Its Redundant Nature
Few Dimensions are Enough: Fine-tuning BERT with Selected Dimensions Revealed Its Redundant Nature
Shion Fukuhata
Yoshinobu Kano
29
0
0
07 Apr 2025
Practical Performative Policy Learning with Strategic Agents
Practical Performative Policy Learning with Strategic Agents
Qianyi Chen
Ying Chen
Bo Li
110
0
0
02 Dec 2024
Rescue Conversations from Dead-ends: Efficient Exploration for
  Task-oriented Dialogue Policy Optimization
Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization
Yangyang Zhao
Zhenyu Wang
Mehdi Dastani
Shihan Wang
24
0
0
05 May 2023
Towards Real-World Applications of Personalized Anesthesia Using Policy
  Constraint Q Learning for Propofol Infusion Control
Towards Real-World Applications of Personalized Anesthesia Using Policy Constraint Q Learning for Propofol Infusion Control
Xiuding Cai
Jiao Chen
Yaoyao Zhu
Beiming Wang
Yu Yao
OffRL
41
5
0
17 Mar 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
29
8
0
18 Feb 2023
Offline Policy Comparison with Confidence: Benchmarks and Baselines
Offline Policy Comparison with Confidence: Benchmarks and Baselines
Anurag Koul
Mariano Phielipp
Alan Fern
OffRL
28
0
0
22 May 2022
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement
  for Value Error
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
Scott Fujimoto
David Meger
Doina Precup
Ofir Nachum
S. Gu
30
32
0
28 Jan 2022
Validate on Sim, Detect on Real -- Model Selection for Domain
  Randomization
Validate on Sim, Detect on Real -- Model Selection for Domain Randomization
Gal Leibovich
Guy Jacob
Shadi Endrawis
Gal Novik
Aviv Tamar
30
7
0
01 Nov 2021
Medical Dead-ends and Learning to Identify High-risk States and
  Treatments
Medical Dead-ends and Learning to Identify High-risk States and Treatments
Mehdi Fatemi
Taylor W. Killian
J. Subramanian
Marzyeh Ghassemi
OffRL
36
37
0
08 Oct 2021
Model Selection for Offline Reinforcement Learning: Practical
  Considerations for Healthcare Settings
Model Selection for Offline Reinforcement Learning: Practical Considerations for Healthcare Settings
Shengpu Tang
Jenna Wiens
OffRL
26
78
0
23 Jul 2021
Supervised Off-Policy Ranking
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
32
5
0
03 Jul 2021
Benchmarks for Deep Off-Policy Evaluation
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
35
100
0
30 Mar 2021
Replacing Rewards with Examples: Example-Based Policy Search via
  Recursive Classification
Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification
Benjamin Eysenbach
Sergey Levine
Ruslan Salakhutdinov
OffRL
39
50
0
23 Mar 2021
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible
  Off-Policy Evaluation
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
24
73
0
17 Aug 2020
Hyperparameter Selection for Offline Reinforcement Learning
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
49
146
0
17 Jul 2020
Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic
  Reinforcement Learning
Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning
Ryan Julian
Benjamin Swanson
Gaurav Sukhatme
Sergey Levine
Chelsea Finn
Karol Hausman
OnRL
CLL
33
43
0
21 Apr 2020
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement
  Learning
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen
Zijian Zhou
Zhilin Wang
Che Wang
Yanqiu Wu
Keith Ross
OffRL
33
121
0
27 Oct 2019
1