1801.01290
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,130 papers shown
Title
PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets
Shuo Sun
Molei Qin
Xinrun Wang
Bo An
FaML
OffRL
AIFin
96
5
0
14 Jan 2023
World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges
T. Taniguchi
Shingo Murata
Masahiro Suzuki
D. Ognibene
Pablo Lanillos
...
L. Jamone
Tomoaki Nakamura
Alejandra Ciria
B. Lara
G. Pezzulo
105
57
0
14 Jan 2023
Learning to Control and Coordinate Mixed Traffic Through Robot Vehicles at Complex and Unsignalized Intersections
Dawei Wang
Weizi Li
Lei Zhu
Jia Pan
76
16
0
12 Jan 2023
Efficient Preference-Based Reinforcement Learning Using Learned Dynamics Models
Yi Liu
Gaurav Datta
Ellen R. Novoseller
Daniel S. Brown
118
24
0
11 Jan 2023
Mastering Diverse Domains through World Models
Danijar Hafner
J. Pašukonis
Jimmy Ba
Timothy Lillicrap
121
617
0
10 Jan 2023
Hint assisted reinforcement learning: an application in radio astronomy
S. Yatawatta
148
1
0
10 Jan 2023
Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework
Zongwei Liu
Yonghong Song
Yuanlin Zhang
OffRL
84
3
0
10 Jan 2023
Sequential Fair Resource Allocation under a Markov Decision Process Framework
Parisa Hassanzadeh
Eleonora Kreacic
Sihan Zeng
Yuchen Xiao
Sumitra Ganesh
46
3
0
10 Jan 2023
Sample-efficient Surrogate Model for Frequency Response of Linear PDEs using Self-Attentive Complex Polynomials
A. Cohen
W. Dou
Jiang Zhu
S. Koziel
Péter Renner
J. Mattsson
Xiaomeng Yang
Beidi Chen
Kevin R. Stone
Yuandong Tian
50
0
0
06 Jan 2023
Centralized Cooperative Exploration Policy for Continuous Control Tasks
Chong Li
Chen Gong
Qiang He
Xinwen Hou
Yu Liu
98
1
0
06 Jan 2023
Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg
Joey Hejna
Matthieu Geist
Stefano Ermon
OffRL
103
80
0
05 Jan 2023
Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping
Lina Mezghani
Sainbayar Sukhbaatar
Piotr Bojanowski
A. Lazaric
Alahari Karteek
OffRL
139
19
0
05 Jan 2023
Learning a Generic Value-Selection Heuristic Inside a Constraint Programming Solver
Tom Marty
Tristan François
Pierre Tessier
Louis Gautier
Louis-Martin Rousseau
Quentin Cappart
108
7
0
05 Jan 2023
Contextual Conservative Q-Learning for Offline Reinforcement Learning
Ke Jiang
Jiayu Yao
Xiaoyang Tan
OffRL
48
0
0
03 Jan 2023
A Policy Optimization Method Towards Optimal-time Stability
Shengjie Wang
Lan Fengb
Xiang Zheng
Yu-wen Cao
Oluwatosin Oseni
Haotian Xu
Tao Zhang
Yang Gao
112
1
0
02 Jan 2023
Optimization of Image Transmission in a Cooperative Semantic Communication Networks
Wenjing Zhang
Yining Wang
Mingzhe Chen
Tao Luo
Dusit Niyato
59
45
0
01 Jan 2023
Goal-Guided Transformer-Enabled Reinforcement Learning for Efficient Autonomous Navigation
Wenhui Huang
Yanxin Zhou
Xiangkun He
Chengqi Lv
74
32
0
01 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
208
36
0
01 Jan 2023
MERLIN: Multi-agent offline and transfer learning for occupant-centric energy flexible operation of grid-interactive communities using smart meter data and CityLearn
Kingsley Nweye
S. Sankaranarayanan
Zoltán Nagy
OffRL
AI4CE
64
27
0
31 Dec 2022
A Mapping of Assurance Techniques for Learning Enabled Autonomous Systems to the Systems Engineering Lifecycle
Christian Ellis
Maggie B. Wigness
L. Fiondella
72
1
0
30 Dec 2022
Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks
Trevor Ablett
Bryan Chan
Jonathan Kelly
150
10
0
30 Dec 2022
Online learning techniques for prediction of temporal tabular datasets with regime changes
Thomas Wong
Mauricio Barahona
OOD
AI4TS
85
1
0
30 Dec 2022
Hybrid Deep Reinforcement Learning and Planning for Safe and Comfortable Automated Driving
Dikshant Gupta
Matthias Klusch
81
2
0
30 Dec 2022
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
67
0
0
29 Dec 2022
On the Geometry of Reinforcement Learning in Continuous State and Action Spaces
Saket Tiwari
Omer Gottesman
George Konidaris
74
0
0
29 Dec 2022
Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles
Brian Angulo
Aleksandr I. Panov
Konstantin Yakovlev
79
12
0
29 Dec 2022
Tuning Synaptic Connections instead of Weights by Genetic Algorithm in Spiking Policy Network
Duzhen Zhang
Tielin Zhang
Shuncheng Jia
Qingyu Wang
Bo Xu
OffRL
381
5
0
29 Dec 2022
Backward Curriculum Reinforcement Learning
Kyungmin Ko
OnRL
55
0
0
29 Dec 2022
Towards automating Codenames spymasters with deep reinforcement learning
Sherman Siu
73
2
0
28 Dec 2022
Representation Learning in Deep RL via Discrete Information Bottleneck
Riashat Islam
Hongyu Zang
Manan Tomar
Aniket Didolkar
Md. Mofijul Islam
...
Tariq Iqbal
Xin-hui Li
Anirudh Goyal
N. Heess
Alex Lamb
SSL
OffRL
71
8
0
28 Dec 2022
Deep Reinforcement Learning for Wind and Energy Storage Coordination in Wholesale Energy and Ancillary Service Markets
Jinhao Li
Changlong Wang
Hao Wang
24
9
0
27 Dec 2022
Off-Policy Reinforcement Learning with Loss Function Weighted by Temporal Difference Error
Bumgeun Park
Taeyoung Kim
Woohyeon Moon
L. Vecchietti
Dongsoo Har
OffRL
67
2
0
26 Dec 2022
Learning Generalizable Representations for Reinforcement Learning via Adaptive Meta-learner of Behavioral Similarities
Jianda Chen
Sinno Jialin Pan
SSL
61
6
0
26 Dec 2022
SHIRO: Soft Hierarchical Reinforcement Learning
Kandai Watanabe
Mathew Strong
Omer Eldar
75
1
0
24 Dec 2022
NARS vs. Reinforcement learning: ONA vs. Q-Learning
Ali Beikmohammadi
113
0
0
23 Dec 2022
Investigation of reinforcement learning for shape optimization of profile extrusion dies
C. Fricke
D. Wolff
Marco Kemmerling
S. Elgeti
OffRL
19
5
0
23 Dec 2022
Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios
Yiren Lu
Justin Fu
George Tucker
Xinlei Pan
Eli Bronstein
...
Brandyn White
Aleksandra Faust
Shimon Whiteson
Drago Anguelov
Sergey Levine
OffRL
111
97
0
21 Dec 2022
Lifelong Reinforcement Learning with Modulating Masks
Eseoghene Ben-Iwhiwhu
Saptarshi Nath
Praveen K. Pilly
Soheil Kolouri
Andrea Soltoggio
CLL
OffRL
106
23
0
21 Dec 2022
Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search
Taisuke Kobayashi
93
1
0
21 Dec 2022
Variational Quantum Soft Actor-Critic for Robotic Arm Control
Alberto Acuto
Paola Barilla
Ludovico Bozzolo
Matteo Conterno
Mattia Pavese
A. Policicchio
86
9
0
20 Dec 2022
Learning Latent Representations to Co-Adapt to Humans
Sagar Parekh
Dylan P. Losey
97
12
0
19 Dec 2022
Near-optimal Policy Identification in Active Reinforcement Learning
Xiang Li
Viraj Mehta
Johannes Kirschner
I. Char
Willie Neiswanger
J. Schneider
Andreas Krause
Ilija Bogunovic
OffRL
89
6
0
19 Dec 2022
Risk-Sensitive Reinforcement Learning with Exponential Criteria
Erfaun Noorani
Christos N. Mavridis
John S. Baras
109
9
0
18 Dec 2022
Enhancing Cyber Resilience of Networked Microgrids using Vertical Federated Reinforcement Learning
Sayak Mukherjee
Ramij-Raja Hossain
Yuan Liu
W. Du
Veronica Adetola
Sheik M. Mohiuddin
Qiuhua Huang
Tianzhixi Yin
Ankit Singhal
61
5
0
17 Dec 2022
Training Robots to Evaluate Robots: Example-Based Interactive Reward Functions for Policy Learning
Kun-Yen Huang
E. Hu
Dinesh Jayaraman
OffRL
111
5
0
17 Dec 2022
Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning
Zhecheng Yuan
Zhengrong Xue
Bo Yuan
Xueqian Wang
Yi Wu
Yang Gao
Huazhe Xu
SSL
OffRL
120
74
0
17 Dec 2022
Latent Variable Representation for Reinforcement Learning
Zhaolin Ren
Chenjun Xiao
Tianjun Zhang
Na Li
Zhaoran Wang
Sujay Sanghavi
Dale Schuurmans
Bo Dai
OffRL
106
10
0
17 Dec 2022
Safe Evaluation For Offline Learning: Are We Ready To Deploy?
Hager Radi
Josiah P. Hanna
Peter Stone
Matthew E. Taylor
OffRL
ELM
77
0
0
16 Dec 2022
A Simple Decentralized Cross-Entropy Method
Zichen Zhang
Jun Jin
Martin Jägersand
Jun Luo
Dale Schuurmans
53
10
0
16 Dec 2022
Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies
Shivakanth Sujit
Pedro H. M. Braga
J. Bornschein
Samira Ebrahimi Kahou
OffRL
78
1
0
15 Dec 2022