arXiv:2102.04168 (v5, latest)
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature
8 February 2021
Kefan Dong
Jiaqi Yang
Tengyu Ma
Papers citing "Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature"
38 / 38 papers shown
Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning
Subhojyoti Mukherjee
Josiah P. Hanna
Qiaomin Xie
Robert Nowak
212
2
0
07 Jun 2024
Bilinear Classes: A Structural Framework for Provable Generalization in RL
S. Du
Sham Kakade
Jason D. Lee
Shachar Lovett
G. Mahajan
Wen Sun
Ruosong Wang
OffRL
171
191
0
19 Mar 2021
Adapting to Misspecification in Contextual Bandits with Offline Regression Oracles
Sanath Kumar Krishnamurthy
Vitor Hadad
Susan Athey
OffRL
122
23
0
26 Feb 2021
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
82
18
0
09 Nov 2020
Efficient Planning in Large MDPs with Weak Linear Function Approximation
R. Shariff
Csaba Szepesvári
68
22
0
13 Jul 2020
Information Theoretic Regret Bounds for Online Nonlinear Control
Sham Kakade
A. Krishnamurthy
Kendall Lowrey
Motoya Ohnishi
Wen Sun
67
119
0
22 Jun 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Alekh Agarwal
Sham Kakade
A. Krishnamurthy
Wen Sun
OffRL
168
227
0
18 Jun 2020
A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning
Sijia Liu
Pin-Yu Chen
B. Kailkhura
Gaoyuan Zhang
A. Hero III
P. Varshney
72
235
0
11 Jun 2020
Model-Based Reinforcement Learning with Value-Targeted Regression
Alex Ayoub
Zeyu Jia
Csaba Szepesvári
Mengdi Wang
Lin F. Yang
OffRL
94
305
0
01 Jun 2020
Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang
Ruslan Salakhutdinov
Lin F. Yang
65
55
0
21 May 2020
Model-Augmented Actor-Critic: Backpropagating through Paths
I. Clavera
Yao Fu
Pieter Abbeel
77
88
0
16 May 2020
Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability
D. Simchi-Levi
Yunzong Xu
OffRL
379
111
0
28 Mar 2020
Improved Optimistic Algorithms for Logistic Bandits
Louis Faury
Marc Abeille
Clément Calauzènes
Olivier Fercoq
87
95
0
18 Feb 2020
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Dylan J. Foster
Alexander Rakhlin
369
212
0
12 Feb 2020
Dota 2 with Large Scale Deep Reinforcement Learning
OpenAI
Christopher Berner
Greg Brockman
Brooke Chan
...
Szymon Sidor
Ilya Sutskever
Jie Tang
Filip Wolski
Susan Zhang
GNN
VLM
CLL
AI4CE
LRM
169
1,836
0
13 Dec 2019
Provably Efficient Exploration in Policy Optimization
Qi Cai
Zhuoran Yang
Chi Jin
Zhaoran Wang
66
283
0
12 Dec 2019
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang
Ruosong Wang
S. Du
A. Krishnamurthy
182
137
0
09 Dec 2019
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
126
1,371
0
03 Dec 2019
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Dipendra Kumar Misra
Mikael Henaff
A. Krishnamurthy
John Langford
79
151
0
13 Nov 2019
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
Alekh Agarwal
Sham Kakade
Jason D. Lee
G. Mahajan
69
321
0
01 Aug 2019
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin
Zhuoran Yang
Zhaoran Wang
Michael I. Jordan
98
560
0
11 Jul 2019
When to Trust Your Model: Model-Based Policy Optimization
Michael Janner
Justin Fu
Marvin Zhang
Sergey Levine
OffRL
102
956
0
19 Jun 2019
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Lin F. Yang
Mengdi Wang
OffRL
GP
64
288
0
24 May 2019
Provably efficient RL with Rich Observations via Latent State Decoding
S. Du
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
Miroslav Dudík
John Langford
OffRL
74
230
0
25 Jan 2019
Learning Latent Dynamics for Planning from Pixels
Danijar Hafner
Timothy Lillicrap
Ian S. Fischer
Ruben Villegas
David R Ha
Honglak Lee
James Davidson
BDL
88
1,446
0
12 Nov 2018
On Oracle-Efficient PAC RL with Rich Observations
Christoph Dann
Nan Jiang
A. Krishnamurthy
Alekh Agarwal
John Langford
Robert Schapire
49
98
0
01 Mar 2018
Applications of Deep Learning and Reinforcement Learning to Biological Data
M. S. M. Mahmud
M. S. Kaiser
Amir Hussain
S. Vassanelli
OffRL
AI4CE
67
645
0
10 Nov 2017
No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
Rong Ge
Chi Jin
Yi Zheng
132
436
0
03 Apr 2017
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
280
624
0
22 Sep 2016
Matrix Completion has No Spurious Local Minimum
Rong Ge
Jason D. Lee
Tengyu Ma
114
599
0
24 May 2016
Empirical Evaluation of Rectified Activations in Convolutional Network
Bing Xu
Naiyan Wang
Tianqi Chen
Mu Li
140
2,913
0
05 May 2015
Escaping From Saddle Points --- Online Stochastic Gradient for Tensor Decomposition
Rong Ge
Furong Huang
Chi Jin
Yang Yuan
143
1,059
0
06 Mar 2015
Optimal rates for zero-order convex optimization: the power of two function evaluations
John C. Duchi
Michael I. Jordan
Martin J. Wainwright
Andre Wibisono
82
489
0
07 Dec 2013
Finite-Time Analysis of Kernelised Contextual Bandits
Michal Valko
N. Korda
Rémi Munos
I. Flaounas
N. Cristianini
185
275
0
26 Sep 2013
Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit
Alexandra Carpentier
Rémi Munos
82
102
0
18 May 2012
A tail inequality for quadratic forms of subgaussian random vectors
Daniel J. Hsu
Sham Kakade
Tong Zhang
139
422
0
13 Oct 2011
Efficient Learning of Generalized Linear and Single Index Models with Isotonic Regression
Sham Kakade
Adam Tauman Kalai
Varun Kanade
Ohad Shamir
196
180
0
11 Apr 2011
Online Learning via Sequential Complexities
Alexander Rakhlin
Karthik Sridharan
Ambuj Tewari
117
104
0
06 Jun 2010