Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review

2 May 2018
Sergey Levine
    AI4CE
    BDL

Papers citing "Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review"

50 / 123 papers shown
Title
CHD: Coupled Hierarchical Diffusion for Long-Horizon Tasks
Ce Hao
Anxing Xiao
Zhiwei Xue
Harold Soh
49
0
0
12 May 2025
Toward Efficient Exploration by Large Language Model Agents
Dilip Arumugam
Thomas L. Griffiths
LLMAG
94
1
0
29 Apr 2025
Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization
Emiliano Penaloza
Tianyue H. Zhan
Laurent Charlin
Mateo Espinosa Zarlenga
51
0
0
25 Apr 2025
Trust-Region Twisted Policy Improvement
Joery A. de Vries
Jinke He
Yaniv Oren
M. Spaan
OffRL
LRM
35
0
0
08 Apr 2025
Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning
Zhenghai Xue
Lang Feng
Jiacheng Xu
Kang Kang
Xiang Wen
Jingyi Wang
Shuicheng Yan
OffRL
53
0
0
10 Mar 2025
Advancing Autonomous VLM Agents via Variational Subgoal-Conditioned Reinforcement Learning
Qingyuan Wu
Jianheng Liu
Haifeng Zhang
Jun Wang
Kun Shao
OffRL
107
0
0
11 Feb 2025
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability
Qingyue Zhao
Kaixuan Ji
Heyang Zhao
Tong Zhang
Q. Gu
OffRL
45
0
0
09 Feb 2025
DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning
Utsav Singh
Souradip Chakraborty
Wesley A Suttle
Brian M. Sadler
Vinay P. Namboodiri
Amrit Singh Bedi
OffRL
53
0
0
03 Jan 2025
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
123
2
0
10 Dec 2024
Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation
Huy Le
Miroslav Gabriel
Tai Hoang
Gerhard Neumann
Ngo Anh Vien
114
1
0
22 Nov 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
40
2
0
03 Oct 2024
Average-Reward Maximum Entropy Reinforcement Learning for Underactuated Double Pendulum Tasks
Jean Seong Bjorn Choe
Bumkyu Choi
Jong-kook Kim
43
2
0
13 Sep 2024
Learning Causally Invariant Reward Functions from Diverse Demonstrations
Ivan Ovinnikov
Eugene Bykovets
J. M. Buhmann
CML
40
0
0
12 Sep 2024
Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation
Detian Chu
Linyuan Bai
Jianuo Huang
Zhenlong Fang
Peng Zhang
Wei Kang
Haifeng Lin
45
2
0
08 Jul 2024
Variational Best-of-N Alignment
Afra Amini
Tim Vieira
Ryan Cotterell
BDL
43
18
0
08 Jul 2024
What type of inference is planning?
Miguel Lázaro-Gredilla
Li Yang Ku
Kevin P. Murphy
Dileep George
31
2
0
25 Jun 2024
M-HOF-Opt: Multi-Objective Hierarchical Output Feedback Optimization via Multiplier Induced Loss Landscape Scheduling
Xudong Sun
Nutan Chen
Alexej Gossmann
Yu Xing
Carla Feistner
...
Felix Drost
Daniele Scarcella
Lisa Beer
Carsten Marr
59
1
0
20 Mar 2024
Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments
Alexander W. Goodall
Francesco Belardinelli
OffRL
33
1
0
01 Feb 2024
Model Predictive Inferential Control of Neural State-Space Models for Autonomous Vehicle Motion Planning
Iman Askari
Xumein Tu
Shen Zeng
Huazhen Fang
24
5
0
12 Oct 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
39
48
0
06 Oct 2023
Accelerating optimization over the space of probability measures
Shi Chen
Wenxuan Wu
Yuhang Yao
Stephen J. Wright
32
5
0
06 Oct 2023
A General Offline Reinforcement Learning Framework for Interactive Recommendation
Teng Xiao
Donglin Wang
OffRL
34
73
0
01 Oct 2023
Recent Advances in Path Integral Control for Trajectory Optimization: An Overview in Theoretical and Algorithmic Perspectives
Muhammad Kazim
JunGee Hong
Min-Gyeom Kim
Kwang-Ki K. Kim
39
16
0
22 Sep 2023
Foundational Policy Acquisition via Multitask Learning for Motor Skill Generation
Satoshi Yamamori
Jun Morimoto
26
0
0
31 Aug 2023
World-Model-Based Control for Industrial box-packing of Multiple Objects using NewtonianVAE
Yusuke Kato
Ryogo Okumura
T. Taniguchi
DRL
27
1
0
04 Aug 2023
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Yanran Wang
Qiuchen Qian
David E. Boyle
16
4
0
13 Jul 2023
Control as Probabilistic Inference as an Emergent Communication Mechanism in Multi-Agent Reinforcement Learning
Tomoaki Nakamura
Akira Taniguchi
T. Taniguchi
AI4CE
13
1
0
11 Jul 2023
Safe Offline Reinforcement Learning with Real-Time Budget Constraints
Qian Lin
Bo Tang
Zifan Wu
Chao Yu
Shangqin Mao
Qianlong Xie
Xingxing Wang
Dong Wang
OffRL
34
11
0
01 Jun 2023
Bayesian Reinforcement Learning with Limited Cognitive Load
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
OffRL
34
8
0
05 May 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
33
0
0
22 Mar 2023
Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint
Taisuke Kobayashi
46
3
0
08 Mar 2023
On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations
Tim G. J. Rudner
Cong Lu
Michael A. Osborne
Yarin Gal
Yee Whye Teh
OffRL
33
27
0
28 Dec 2022
Hierarchical Policy Blending As Optimal Transport
An T. Le
Kay Hansel
Jan Peters
Georgia Chalvatzaki
OT
40
7
0
04 Dec 2022
Utilizing Prior Solutions for Reward Shaping and Composition in Entropy-Regularized Reinforcement Learning
Jacob Adamczyk
A. Arriojas
Stas Tiomkin
R. Kulkarni
45
8
0
02 Dec 2022
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Dilip Arumugam
Mark K. Ho
Noah D. Goodman
Benjamin Van Roy
31
4
0
30 Oct 2022
Implicit Offline Reinforcement Learning via Supervised Learning
Alexandre Piché
Rafael Pardiñas
David Vazquez
Igor Mordatch
C. Pal
SSL
OffRL
29
4
0
21 Oct 2022
Inferring Smooth Control: Monte Carlo Posterior Policy Iteration with Gaussian Processes
Joe Watson
Jan Peters
26
15
0
07 Oct 2022
Variational Inference for Model-Free and Model-Based Reinforcement Learning
Felix Leibfried
OffRL
20
0
0
04 Sep 2022
MPPI-IPDDP: Hybrid Method of Collision-Free Smooth Trajectory Generation for Autonomous Robots
Mingeuk Kim
Kwang-Ki K. Kim
34
3
0
04 Aug 2022
Language Model Cascades
David Dohan
Winnie Xu
Aitor Lewkowycz
Jacob Austin
David Bieber
...
Henryk Michalewski
Rif A. Saurous
Jascha Narain Sohl-Dickstein
Kevin Patrick Murphy
Charles Sutton
ReLM
LRM
38
99
0
21 Jul 2022
Successor Representation Active Inference
Beren Millidge
Christopher L. Buckley
BDL
30
3
0
20 Jul 2022
Minimum Description Length Control
Theodore H. Moskovitz
Ta-Chu Kao
M. Sahani
M. Botvinick
26
1
0
17 Jul 2022
Low Emission Building Control with Zero-Shot Reinforcement Learning
Scott Jeen
Alessandro Abate
Jonathan M. Cullen
AI4CE
19
5
0
28 Jun 2022
Bounding Evidence and Estimating Log-Likelihood in VAE
Lukasz Struski
Marcin Mazur
Pawel Batorski
Przemysław Spurek
Jacek Tabor
21
3
0
19 Jun 2022
Intra-agent speech permits zero-shot task acquisition
Chen Yan
Federico Carnevale
Petko Georgiev
Adam Santoro
Aurelia Guy
Alistair Muldal
Chia-Chun Hung
Josh Abramson
Timothy Lillicrap
Greg Wayne
LM&Ro
36
9
0
07 Jun 2022
On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak
Hady ElSahar
Germán Kruszewski
Marc Dymetman
CLL
25
51
0
01 Jun 2022
Critic Sequential Monte Carlo
Vasileios Lioutas
J. Lavington
Justice Sefas
Matthew Niedoba
Yunpeng Liu
Berend Zwartsenberg
Setareh Dabiri
Frank Wood
Adam Scibior
50
7
0
30 May 2022
Planning with Diffusion for Flexible Behavior Synthesis
Michael Janner
Yilun Du
J. Tenenbaum
Sergey Levine
DiffM
202
633
0
20 May 2022
How to Spend Your Robot Time: Bridging Kickstarting and Offline Reinforcement Learning for Vision-based Robotic Manipulation
Alex X. Lee
Coline Devin
Jost Tobias Springenberg
Yuxiang Zhou
Thomas Lampe
A. Abdolmaleki
Konstantinos Bousmalis
OffRL
OnRL
24
15
0
06 May 2022
Exploration in Deep Reinforcement Learning: A Survey
Pawel Ladosz
Lilian Weng
Minwoo Kim
H. Oh
OffRL
26
324
0
02 May 2022