Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1702.08165
Cited By
Reinforcement Learning with Deep Energy-Based Policies
27 February 2017
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reinforcement Learning with Deep Energy-Based Policies"
48 / 48 papers shown
Title
MARFT: Multi-Agent Reinforcement Fine-Tuning
Junwei Liao
Muning Wen
Jun Wang
Weinan Zhang
OffRL
96
3
0
21 Apr 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi
Yiyang Wu
Linxin Song
Dinesh Manocha
Jieyu Zhao
LRM
119
9
0
07 Apr 2025
Safe Explicable Policy Search
Akkamahadevi Hanni
Jonathan Montaño
Yu Zhang
92
0
0
10 Mar 2025
Maximum Entropy Reinforcement Learning with Diffusion Policy
Xiaoyi Dong
Jian Cheng
Xinsong Zhang
86
1
0
17 Feb 2025
Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network
Jijia Liu
Feng Gao
Q. Liao
Chao Yu
Yu Wang
OffRL
113
0
0
01 Feb 2025
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
163
9
0
29 Jan 2025
Divergence-Augmented Policy Optimization
Qing Wang
Yingru Li
Jiechao Xiong
Tong Zhang
OffRL
141
16
0
28 Jan 2025
On Reward Transferability in Adversarial Inverse Reinforcement Learning: Insights from Random Matrix Theory
Yangchun Zhang
Wang Zhou
Yirui Zhou
75
0
0
31 Dec 2024
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Eliot Xing
Vernon Luk
Jean Oh
132
0
0
16 Dec 2024
Robust Contact-rich Manipulation through Implicit Motor Adaptation
Teng Xue
Amirreza Razmjoo
Suhan Shetty
Sylvain Calinon
136
1
0
16 Dec 2024
Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets
Zhen Liu
Tim Z. Xiao
Weiyang Liu
Yoshua Bengio
Dinghuai Zhang
153
5
0
10 Dec 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
198
6
0
07 Nov 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
128
5
0
22 Oct 2024
Reward-free World Models for Online Imitation Learning
Shangzhe Li
Zhiao Huang
H. Su
OffRL
126
1
0
17 Oct 2024
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Hyungjoo Chae
Namyoung Kim
Kai Tzu-iunn Ong
Minju Gwak
Gwanwoo Song
Jihoon Kim
Seon Gyeom Kim
Dongha Lee
Jinyoung Yeo
LLMAG
66
20
0
17 Oct 2024
Simplifying Deep Temporal Difference Learning
Matteo Gallici
Mattie Fellows
Benjamin Ellis
B. Pou
Ivan Masmitja
Jakob Foerster
Mario Martin
OffRL
104
25
0
05 Jul 2024
Residual-MPPI: Online Policy Customization for Continuous Control
Pengcheng Wang
Chenran Li
Catherine Weaver
Kenta Kawamoto
Masayoshi Tomizuka
Chen Tang
Wei Zhan
OffRL
82
3
0
01 Jul 2024
Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL
Qi Lv
Xiang Deng
Gongwei Chen
Michael Yu Wang
Liqiang Nie
112
7
0
08 Jun 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
103
2
0
30 May 2024
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee
Minsu Kim
Lynn Cherif
David Dobre
Juho Lee
...
Kenji Kawaguchi
Gauthier Gidel
Yoshua Bengio
Nikolay Malkin
Moksh Jain
AAML
112
18
0
28 May 2024
Rule-Based Lloyd Algorithm for Multi-Robot Motion Planning and Control with Safety and Convergence Guarantees
Manuel Boldrer
Álvaro Serra-Gómez
Lorenzo Lyons
Vít Krátký
Javier Alonso-Mora
Laura Ferranti
98
4
0
30 Oct 2023
Multicopy Reinforcement Learning Agents
Alicia P. Wolfe
Oliver Diamond
Brigitte Goeler-Slough
Remi Feuerman
Magdalena Kisielinska
Victoria Manfredi
69
0
0
19 Sep 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
135
13
0
28 Aug 2023
Explainability in Deep Reinforcement Learning
Alexandre Heuillet
Fabien Couthouis
Natalia Díaz Rodríguez
XAI
143
281
0
15 Aug 2020
Quantum enhancements for deep reinforcement learning in large spaces
Sofiene Jerbi
Lea M. Trenkwalder
Hendrik Poulsen Nautrup
Hans J. Briegel
Vedran Dunjko
71
5
0
28 Oct 2019
Discrete Sequential Prediction of Continuous Actions for Deep RL
Luke Metz
Julian Ibarz
Navdeep Jaitly
James Davidson
BDL
OffRL
68
119
0
14 May 2017
Equivalence Between Policy Gradients and Soft Q-Learning
John Schulman
Xi Chen
Pieter Abbeel
OffRL
80
345
0
21 Apr 2017
Stochastic Neural Networks for Hierarchical Reinforcement Learning
Carlos Florensa
Yan Duan
Pieter Abbeel
BDL
78
361
0
10 Apr 2017
Stein Variational Policy Gradient
Yang Liu
Prajit Ramachandran
Qiang Liu
Jian-wei Peng
66
139
0
07 Apr 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
148
470
0
28 Feb 2017
Loss is its own Reward: Self-Supervision for Reinforcement Learning
Evan Shelhamer
Parsa Mahmoudieh
Max Argus
Trevor Darrell
SSL
77
186
0
21 Dec 2016
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
86
1,228
0
16 Nov 2016
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
88
345
0
07 Nov 2016
Learning to Draw Samples: With Application to Amortized MLE for Generative Adversarial Learning
Dilin Wang
Qiang Liu
GAN
BDL
112
119
0
06 Nov 2016
Combining policy gradient and Q-learning
Brendan O'Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
OffRL
OnRL
66
139
0
05 Nov 2016
Learning and Transfer of Modulated Locomotor Controllers
N. Heess
Greg Wayne
Yuval Tassa
Timothy Lillicrap
Martin Riedmiller
David Silver
65
207
0
17 Oct 2016
Energy-based Generative Adversarial Network
Jiaqi Zhao
Michaël Mathieu
Yann LeCun
GAN
133
1,114
0
11 Sep 2016
Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Qiang Liu
Dilin Wang
BDL
65
1,091
0
16 Aug 2016
Deep Directed Generative Models with Energy-Based Probability Estimation
Taesup Kim
Yoshua Bengio
GAN
45
136
0
10 Jun 2016
Continuous Deep Q-Learning with Model-based Acceleration
S. Gu
Timothy Lillicrap
Ilya Sutskever
Sergey Levine
86
1,012
0
02 Mar 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
189
8,833
0
04 Feb 2016
Taming the Noise in Reinforcement Learning via Soft Updates
Roy Fox
Ari Pakman
Naftali Tishby
67
338
0
28 Dec 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
306
13,214
0
09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
82
3,399
0
08 Jun 2015
End-to-End Training of Deep Visuomotor Policies
Sergey Levine
Chelsea Finn
Trevor Darrell
Pieter Abbeel
BDL
284
3,431
0
02 Apr 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
274
6,755
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.5K
149,842
0
22 Dec 2014
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
114
12,201
0
19 Dec 2013
1