2209.07148
Semi-supervised Batch Learning From Logged Data
15 September 2022
Gholamali Aminian
Armin Behnamnia
R. Vega
Laura Toni
Chengchun Shi
Hamid R. Rabiee
Omar Rivasplata
Miguel R. D. Rodrigues
OffRL
Papers citing "Semi-supervised Batch Learning From Logged Data"
15 / 15 papers shown
Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
81
26
0
19 Dec 2022
PAC-Bayesian Offline Contextual Bandits With Guarantees
Otmane Sakhi
Pierre Alquier
Nicolas Chopin
OffRL
99
13
0
24 Oct 2022
A Survey on Deep Semi-supervised Learning
Xiangli Yang
Zixing Song
Irwin King
Zenglin Xu
62
576
0
28 Feb 2021
Semi-supervised reward learning for offline reinforcement learning
Ksenia Konyushkova
Konrad Zolna
Y. Aytar
Alexander Novikov
Scott E. Reed
Serkan Cabi
Nando de Freitas
SSL
OffRL
93
23
0
12 Dec 2020
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar
Aurick Zhou
George Tucker
Sergey Levine
OffRL
OnRL
99
1,780
0
08 Jun 2020
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha Jaques
Asma Ghandeharioun
J. Shen
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
78
338
0
30 Jun 2019
Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
Yuan Xie
Boyi Liu
Qiang Liu
Zhaoran Wang
Yuanshuo Zhou
Jian-wei Peng
OffRL
34
19
0
01 Aug 2018
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms
Han Xiao
Kashif Rasul
Roland Vollgraf
173
8,807
0
25 Aug 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
91
1,313
0
30 May 2017
Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes
Ahmed Alaa
M. Schaar
CML
118
300
0
10 Apr 2017
f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
Sebastian Nowozin
Botond Cseke
Ryota Tomioka
GAN
93
1,648
0
02 Jun 2016
Learning Representations for Counterfactual Inference
Fredrik D. Johansson
Uri Shalit
David Sontag
CML
OOD
BDL
269
726
0
12 May 2016
Explore no more: Improved high-probability regret bounds for non-stochastic bandits
Gergely Neu
181
182
0
10 Jun 2015
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
120
285
0
10 Mar 2015
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
150
574
0
31 Mar 2010