1801.01290
Cited By
v1
v2 (latest)
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 4,128 papers shown
Title
Active Perception for Tactile Sensing: A Task-Agnostic Attention-Based Approach
Tim Schneider
Cristiana de Farias
Roberto Calandra
Lawrence Yunliang Chen
Jan Peters
460
1
0
09 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
70
0
0
08 May 2025
Taming OOD Actions for Offline Reinforcement Learning: An Advantage-Based Approach
Xuyang Chen
Keyu Yan
Lin Zhao
OffRL
126
1
0
08 May 2025
A critical assessment of reinforcement learning methods for microswimmer navigation in complex flows
Selim Mecanna
Aurore Loisy
Christophe Eloy
96
0
0
08 May 2025
Optimization of Infectious Disease Intervention Measures Based on Reinforcement Learning - Empirical analysis based on UK COVID-19 epidemic data
Baida Zhang
Yakai Chen
Huichun Li
Zhenghu Zu
63
0
0
07 May 2025
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation
Abdulaziz Almuzairee
Rohan Patil
Dwait Bhatt
Henrik I. Christensen
89
0
0
07 May 2025
A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance
Axel Friedrich Wolter
Tobias Sutter
OffRL
96
0
0
07 May 2025
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
Matthew Sgambati
Aleksandar Vakanski
Matthew Anderson
47
0
0
06 May 2025
Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach
Yue Chen
Hui Kang
Jiahui Li
Geng Sun
Boxiong Wang
Jiacheng Wang
Cong Liang
Shuang Liang
Dusit Niyato
242
0
0
06 May 2025
Policy-labeled Preference Learning: Is Preference Enough for RLHF?
Taehyun Cho
Seokhun Ju
Seungyub Han
Dohyeong Kim
Kyungjae Lee
Jungwoo Lee
OffRL
125
0
0
06 May 2025
Aerodynamic and structural airfoil shape optimisation via Transfer Learning-enhanced Deep Reinforcement Learning
David Ramos
Lucas Lacasa
E. Valero
G. Rubio
AI4CE
113
0
0
05 May 2025
Zero-shot Sim2Real Transfer for Magnet-Based Tactile Sensor on Insertion Tasks
Beining Han
Abhishek Joshi
Jia Deng
83
0
0
05 May 2025
A Goal-Oriented Reinforcement Learning-Based Path Planning Algorithm for Modular Self-Reconfigurable Satellites
Bofei Liu
Dong Ye
Zunhao Yao
Zhaowei Sun
63
0
0
04 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Zhiyong Yang
Shengchao Hu
Li Shen
Hechang Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
429
0
0
03 May 2025
Skill-based Safe Reinforcement Learning with Risk Planning
Hanping Zhang
Yuhong Guo
OffRL
OnRL
93
0
0
02 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
69
0
0
01 May 2025
Multi-Constraint Safe Reinforcement Learning via Closed-form Solution for Log-Sum-Exp Approximation of Control Barrier Functions
Chenggang Wang
Xinyi Wang
Yutong Dong
Lei Song
Xinping Guan
77
0
0
01 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
438
0
0
01 May 2025
A General Approach of Automated Environment Design for Learning the Optimal Power Flow
Thomas Wolgast
Astrid Nieße
AI4CE
67
0
0
01 May 2025
Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations
Minseok Song
JeongHo Ha
Bonggyeong Park
Daehyung Park
453
0
0
01 May 2025
Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures
Junwon Seo
Kensuke Nakamura
Andrea V. Bajcsy
125
0
0
01 May 2025
Wasserstein Policy Optimization
David Pfau
Ian Davies
Diana Borsa
Joao G. M. Araujo
Brendan D. Tracey
H. V. Hasselt
91
1
0
01 May 2025
Neuro-Symbolic Generation of Explanations for Robot Policies with Weighted Signal Temporal Logic
Mikihisa Yuasa
R. Sreenivas
Huy T. Tran
76
0
0
30 Apr 2025
Improvements of Dark Experience Replay and Reservoir Sampling towards Better Balance between Consolidation and Plasticity
Taisuke Kobayashi
CLL
67
0
0
29 Apr 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
91
0
0
26 Apr 2025
Learning from Less: SINDy Surrogates in RL
Aniket Dixit
Muhammad Ibrahim Khan
Faizan Ahmed
James Brusey
52
0
0
25 Apr 2025
Depth-Constrained ASV Navigation with Deep RL and Limited Sensing
Amirhossein Zhalehmehrabi
Daniele Meli
Francesco Dal Santo
Francesco Trotti
Alessandro Farinelli
60
0
0
25 Apr 2025
CaRL: Learning Scalable Planning Policies with Simple Rewards
Bernhard Jaeger
D. Dauner
Jens Beißwenger
Simon Gerstenecker
Kashyap Chitta
Andreas Geiger
131
2
0
24 Apr 2025
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Zihan Wang
Kaidi Wang
Q. Wang
Pingyue Zhang
Linjie Li
...
Jiajun Wu
L. Fei-Fei
Lijuan Wang
Yejin Choi
Manling Li
276
30
0
24 Apr 2025
Plasticine: Accelerating Research in Plasticity-Motivated Deep Reinforcement Learning
Mingqi Yuan
Qi Wang
Guozheng Ma
Yue Liu
Xin Jin
Yunbo Wang
Xiaokang Yang
Wenjun Zeng
D. Tao
OffRL
AI4CE
109
0
0
24 Apr 2025
HERB: Human-augmented Efficient Reinforcement learning for Bin-packing
Gojko Perovic
Nuno Ferreira Duarte
Atabak Dehban
Gonçalo Teixeira
Egidio Falotico
J. Santos-Victor
OffRL
37
0
0
23 Apr 2025
Offline Robotic World Model: Learning Robotic Policies without a Physics Simulator
Chenhao Li
Andreas Krause
Marco Hutter
OffRL
56
0
0
23 Apr 2025
Policy-Based Radiative Transfer: Solving the 2-Level Atom Non-LTE Problem using Soft Actor-Critic Reinforcement Learning
Brandon Panos
Ivan Milic
OffRL
61
0
0
22 Apr 2025
Grasping Deformable Objects via Reinforcement Learning with Cross-Modal Attention to Visuo-Tactile Inputs
Yonghyun Lee
Sungeun Hong
Min-gu Kim
Gyeonghwan Kim
Changjoo Nam
80
0
0
22 Apr 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
191
17
0
21 Apr 2025
Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision
Shilin Zhang
Zican Hu
Wenhao Wu
Xinyi Xie
Jianxiang Tang
Chunlin Chen
Daoyi Dong
Yu Cheng
Zhenhong Sun
Zhi Wang
OffRL
450
0
0
21 Apr 2025
Surrogate Fitness Metrics for Interpretable Reinforcement Learning
Philipp Altmann
Céline Davignon
Maximilian Zorn
Fabian Ritz
Claudia Linnhoff-Popien
Thomas Gabor
83
0
0
20 Apr 2025
Coordinating Spinal and Limb Dynamics for Enhanced Sprawling Robot Mobility
Merve Atasever
Ali Okhovat
Azhang Nazaripouya
John Nisbet
Omer Kurkutlu
Jyotirmoy V. Deshmukh
Yasemin Ozkan Aydin
29
0
0
18 Apr 2025
Evolutionary Policy Optimization
Zelal Su "Lain" Mustafaoglu
Keshav Pingali
Risto Miikkulainen
58
0
0
17 Apr 2025
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Haoran Xu
Shuozhe Li
Harshit S. Sikchi
S. Niekum
Amy Zhang
OffRL
118
1
0
17 Apr 2025
VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning
Xuyang Chen
Guojian Wang
Keyu Yan
Lin Zhao
OffRL
99
1
0
16 Apr 2025
pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild
Jonas Myhre Schiøtt
Viktor Sebastian Petersen
Dimitrios P. Papadopoulos
VLM
136
0
0
16 Apr 2025
A Clean Slate for Offline Reinforcement Learning
Matthew Jackson
Uljad Berdica
Jarek Liesen
Shimon Whiteson
Jakob Foerster
OffRL
OnRL
91
1
0
15 Apr 2025
Emergence of Goal-Directed Behaviors via Active Inference with Self-Prior
Dongmin Kim
Hoshinori Kanazawa
Naoto Yoshida
Yasuo Kuniyoshi
AI4CE
67
0
0
15 Apr 2025
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
Daniel Marta
Simon Holk
Miguel Vasco
Jens Lundell
Timon Homberger
F. L. Busch
Olov Andersson
Jens Lundell
Iolanda Leite
141
1
0
14 Apr 2025
Moderate Actor-Critic Methods: Controlling Overestimation Bias via Expectile Loss
Ukjo Hwang
Songnam Hong
OffRL
78
0
0
14 Apr 2025
A Champion-level Vision-based Reinforcement Learning Agent for Competitive Racing in Gran Turismo 7
Hojoon Lee
Takuma Seno
Jun Jet Tai
K. Subramanian
Kenta Kawamoto
Peter Stone
Peter R. Wurman
57
0
0
12 Apr 2025
Follow the STARs: Dynamic ω-Regular Shielding of Learned Policies
Ashwani Anand
Satya Prakash Nayak
Ritam Raha
Anne-Kathrin Schmuck
50
0
0
11 Apr 2025
Neural Fidelity Calibration for Informative Sim-to-Real Adaptation
Youwei Yu
Lantao Liu
79
1
0
11 Apr 2025
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Jing Zhang
Rushuai Yang
Shunyu Liu
Ting-En Lin
Fei Huang
Yi Chen
Yongqian Li
Dacheng Tao
OffRL
91
0
0
10 Apr 2025