ResearchTrend.AI
Policy Optimization via Importance Sampling


17 September 2018
Alberto Maria Metelli
Matteo Papini
Francesco Faccio
Marcello Restelli
    OffRL

Papers citing "Policy Optimization via Importance Sampling"

23 / 23 papers shown
Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training
Yiran Chen
Hao Peng
Tong Zhang
Heng Ji
VLM
32
0
0
13 May 2025
Enhancing Diversity in Parallel Agents: A Maximum State Entropy Exploration Story
Vincenzo De Paola
Riccardo Zamboni
Mirco Mutti
Marcello Restelli
21
0
0
02 May 2025
Compatible Gradient Approximations for Actor-Critic Algorithms
Baturay Saglam
Dionysis Kalogerias
37
0
0
02 Sep 2024
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro
Marco Mussi
Alberto Maria Metelli
Matteo Papini
48
2
0
03 May 2024
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
Yifan Lin
Yuhao Wang
Enlu Zhou
78
0
0
01 Mar 2024
Truncating Trajectories in Monte Carlo Reinforcement Learning
Riccardo Poiani
Alberto Maria Metelli
Marcello Restelli
29
2
0
07 May 2023
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task
S. Ruan
Allen Nie
William Steenbergen
Jiayu He
JQ Zhang
...
Kyle Dang Nguyen
Catherine Y Wang
Rui Ying
James A. Landay
Emma Brunskill
28
18
0
11 Apr 2023
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
28
0
0
29 Dec 2022
On the Reuse Bias in Off-Policy Reinforcement Learning
Chengyang Ying
Zhongkai Hao
Xinning Zhou
Hang Su
Dong Yan
Jun Zhu
OffRL
45
3
0
15 Sep 2022
General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States
Francesco Faccio
Aditya A. Ramesh
Vincent Herrmann
J. Harb
Jürgen Schmidhuber
OffRL
44
8
0
04 Jul 2022
Lifelong Hyper-Policy Optimization with Multiple Importance Sampling Regularization
P. Liotet
Francesco Vidaich
Alberto Maria Metelli
Marcello Restelli
OffRL
14
8
0
13 Dec 2021
Policy Optimization as Online Learning with Mediator Feedback
Alberto Maria Metelli
Matteo Papini
P. D'Oro
Marcello Restelli
OffRL
27
10
0
15 Dec 2020
Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate
Mirco Mutti
Lorenzo Pratissoli
Marcello Restelli
11
19
0
09 Jul 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
27
32
0
24 Mar 2020
POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning
Joseph D. Futoma
M. C. Hughes
Finale Doshi-Velez
OffRL
21
49
0
13 Jan 2020
Model Inversion Networks for Model-Based Optimization
Aviral Kumar
Sergey Levine
OffRL
38
93
0
31 Dec 2019
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Yao Liu
Pierre-Luc Bacon
Emma Brunskill
OffRL
22
45
0
15 Oct 2019
Sparse tree search optimality guarantees in POMDPs with continuous observation spaces
M. Saint-Guillain
T. Vaquero
Steve Ankuo Chien
14
24
0
10 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
33
83
0
18 Sep 2019
Gradient-Aware Model-based Policy Search
P. D'Oro
Alberto Maria Metelli
Andrea Tirinzoni
Matteo Papini
Marcello Restelli
29
34
0
09 Sep 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu
F. Gao
Quanquan Gu
16
93
0
29 May 2019
Trajectory-Based Off-Policy Deep Reinforcement Learning
Andreas Doerr
Michael Volpp
Marc Toussaint
Sebastian Trimpe
Christian Daniel
OffRL
29
2
0
14 May 2019
Diagnosing Bottlenecks in Deep Q-learning Algorithms
Justin Fu
Aviral Kumar
Matthew Soh
Sergey Levine
OffRL
19
141
0
26 Feb 2019