ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2107.01360
  4. Cited By
Supervised Off-Policy Ranking

Supervised Off-Policy Ranking

3 July 2021
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
    OffRL
ArXivPDFHTML

Papers citing "Supervised Off-Policy Ranking"

16 / 16 papers shown
Title
Minimax Model Learning
Minimax Model Learning
Cameron Voloshin
Nan Jiang
Yisong Yue
OffRL
70
18
0
02 Mar 2021
Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle
  Routing Problems
Multi-Decoder Attention Model with Embedding Glimpse for Solving Vehicle Routing Problems
Liang Xin
Wen Song
Zhiguang Cao
Jie Zhang
47
150
0
19 Dec 2020
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy
  Evaluation
Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation
Ilya Kostrikov
Ofir Nachum
OffRL
22
30
0
27 Jul 2020
Off-Policy Evaluation via the Regularized Lagrangian
Off-Policy Evaluation via the Regularized Lagrangian
Mengjiao Yang
Ofir Nachum
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
14
115
0
07 Jul 2020
Critic Regularized Regression
Critic Regularized Regression
Ziyun Wang
Alexander Novikov
Konrad Zolna
Jost Tobias Springenberg
Scott E. Reed
...
Noah Y. Siegel
J. Merel
Çağlar Gülçehre
N. Heess
Nando de Freitas
OffRL
117
320
0
26 Jun 2020
Conservative Q-Learning for Offline Reinforcement Learning
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
80
1,780
0
08 Jun 2020
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
D4RL: Datasets for Deep Data-Driven Reinforcement Learning
Justin Fu
Aviral Kumar
Ofir Nachum
George Tucker
Sergey Levine
GP
OffRL
167
1,338
0
15 Apr 2020
Benchmarking Batch Deep Reinforcement Learning Algorithms
Benchmarking Batch Deep Reinforcement Learning Algorithms
Shih-Han Chou
Wen-Yen Chang
W. Hsu
Jianlong Fu
OffRL
36
182
0
03 Oct 2019
When to Trust Your Model: Model-Based Policy Optimization
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
48
939
0
19 Jun 2019
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary
  Distribution Corrections
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
66
332
0
10 Jun 2019
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral Kumar
Justin Fu
George Tucker
Sergey Levine
OffRL
OnRL
65
1,044
0
03 Jun 2019
Attention, Learn to Solve Routing Problems!
Attention, Learn to Solve Routing Problems!
W. Kool
H. V. Hoof
Max Welling
65
1,193
0
22 Mar 2018
Neural Combinatorial Optimization with Reinforcement Learning
Neural Combinatorial Optimization with Reinforcement Learning
Irwan Bello
Hieu H. Pham
Quoc V. Le
Mohammad Norouzi
Samy Bengio
99
1,472
0
29 Nov 2016
Safe and Efficient Off-Policy Reinforcement Learning
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
105
611
0
08 Jun 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
150
573
0
04 Apr 2016
Pointer Networks
Pointer Networks
Oriol Vinyals
Meire Fortunato
Navdeep Jaitly
72
3,036
0
09 Jun 2015
1