arXiv: 2112.14195

Exponential Family Model-Based Reinforcement Learning via Score Matching

28 December 2021
Gen Li
Junbo Li
Anmol Kabra
Nathan Srebro
Zhaoran Wang
Zhuoran Yang
Abstract

We propose an optimistic model-based algorithm, dubbed SMRL, for finite-horizon episodic reinforcement learning (RL) when the transition model is specified by exponential family distributions with $d$ parameters and the reward is bounded and known. SMRL uses score matching, an unnormalized density estimation technique that enables efficient estimation of the model parameter by ridge regression. Under standard regularity assumptions, SMRL achieves $\tilde{O}(d\sqrt{H^3 T})$ online regret, where $H$ is the length of each episode and $T$ is the total number of interactions (ignoring polynomial dependence on structural scale parameters).
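
To illustrate why score matching reduces to ridge regression for exponential families, here is a minimal sketch in a simplified, unconditional setting. It is not the SMRL algorithm or the paper's transition model. It assumes a one-dimensional density p(x) ∝ exp(theta1 * x - theta2 * x^2 / 2), for which the score matching objective is the quadratic (1/2) theta^T A theta + b^T theta, with A the empirical second moment of the first derivatives of the sufficient statistics and b the empirical mean of their second derivatives. The function name `score_matching_ridge`, the regularizer `lam`, and the sample parameters are all hypothetical choices made for this example.

```python
import numpy as np

def score_matching_ridge(x, lam=1e-6):
    """Score matching estimate for p(x) ∝ exp(theta1*x - theta2*x**2/2).

    Illustrative assumption: sufficient statistics psi(x) = (x, -x**2/2)
    and a constant base measure, so the objective is quadratic in theta.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # First derivatives of psi with respect to x: (1, -x), shape (n, 2).
    dpsi = np.stack([np.ones_like(x), -x], axis=1)
    # Second derivatives of psi with respect to x: (0, -1), shape (n, 2).
    d2psi = np.stack([np.zeros_like(x), -np.ones_like(x)], axis=1)
    # Quadratic term A and linear term b of the score matching objective.
    A = dpsi.T @ dpsi / n
    b = d2psi.mean(axis=0)
    # Ridge-regularized minimizer of 0.5*theta^T A theta + b^T theta.
    return -np.linalg.solve(A + lam * np.eye(2), b)

# Samples from a Gaussian with mean 2.0 and standard deviation 0.5,
# i.e. natural parameters theta1 = mean/var = 8 and theta2 = 1/var = 4.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=0.5, size=20000)
theta_hat = score_matching_ridge(samples)
print(theta_hat)
```

Note that the normalizing constant of the density never appears: only derivatives of the sufficient statistics with respect to x are needed, which is what makes the estimate a closed-form linear solve.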
