Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1801.01290
Cited By
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"
50 / 1,621 papers shown
Title
Multi-parameter Control for the (1+(
λ
λ
λ
,
λ
λ
λ
))-GA on OneMax via Deep Reinforcement Learning
Tai Nguyen
Phong Le
Carola Doerr
Nguyen Dang
24
0
0
19 May 2025
SAINT: Attention-Based Modeling of Sub-Action Dependencies in Multi-Action Policies
Matthew Landers
Taylor W. Killian
Thomas Hartvigsen
Afsaneh Doryab
16
0
0
17 May 2025
Bench-NPIN: Benchmarking Non-prehensile Interactive Navigation
Ninghan Zhong
Steven Caro
Avraiem Iskandar
Megnath Ramesh
Stephen L. Smith
12
0
0
17 May 2025
Q-Policy: Quantum-Enhanced Policy Evaluation for Scalable Reinforcement Learning
Kalyan Cherukuri
Aarav Lala
Yash Yardi
7
0
0
17 May 2025
Bi-Level Policy Optimization with Nyström Hypergradients
Arjun Prakash
Naicheng He
Denizalp Goktas
Amy Greenwald
12
0
0
16 May 2025
Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics
Conor F. Hayes
Felipe Leno Da Silva
Jiachen Yang
T. Nathan Mundhenk
Chak Shing Lee
...
Ahmet Can Solak
Thomas Desautels
Daniel Faissol
Brenden K. Petersen
Mikel Landajuela
22
0
0
16 May 2025
ReaCritic: Large Reasoning Transformer-based DRL Critic-model Scaling For Heterogeneous Networks
Feiran You
Hongyang Du
OffRL
LRM
27
0
0
16 May 2025
Exploration by Random Distribution Distillation
Zhirui Fang
Kai Yang
Jian Tao
Jiafei Lyu
Lusong Li
Li Shen
Xiu Li
14
0
0
16 May 2025
Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations
Wenrui Cai
Chengyu Wang
Junbing Yan
Jun Huang
Xiangzhong Fang
LRM
24
0
0
16 May 2025
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation
Haoxiang You
Yilang Liu
Ian Abraham
22
0
0
15 May 2025
Modular Robot Control with Motor Primitives
Moses C. Nah
Johannes Lachner
Neville Hogan
21
0
0
15 May 2025
Approximated Behavioral Metric-based State Projection for Federated Reinforcement Learning
Zengxia Guo
Bohui An
Zhongqi Lu
FedML
26
0
0
15 May 2025
Knowledge capture, adaptation and composition (KCAC): A framework for cross-task curriculum learning in robotic manipulation
Xinrui Wang
Yan Jin
31
0
0
15 May 2025
General Dynamic Goal Recognition
Osher Elhadad
Reuth Mirsky
AI4CE
29
1
0
14 May 2025
Preserving Plasticity in Continual Learning with Adaptive Linearity Injection
Seyed Roozbeh Razavi Rohani
Khashayar Khajavi
Wesley Chung
Mo Chen
Sharan Vaswani
CLL
AI4CE
38
0
0
14 May 2025
Generalization in Monitored Markov Decision Processes (Mon-MDPs)
Montaser Mohammedalamen
Michael Bowling
29
0
0
13 May 2025
Adaptive Diffusion Policy Optimization for Robotic Manipulation
Huiyun Jiang
Zhuang Yang
34
0
0
13 May 2025
Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation
Enci Zhang
Xingang Yan
Wei Lin
Tianxiang Zhang
Qianchun Lu
LRM
33
0
0
13 May 2025
Continuous World Coverage Path Planning for Fixed-Wing UAVs using Deep Reinforcement Learning
Mirco Theile
Andres R. Zapata Rodriguez
Marco Caccamo
Alberto L. Sangiovanni-Vincentelli
34
0
0
13 May 2025
Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning
Xinyue Wang
Zhen Zhang
OffRL
CML
32
0
0
13 May 2025
Imagine, Verify, Execute: Memory-Guided Agentic Exploration with Vision-Language Models
Seungjae Lee
Daniel Ekpo
Haowen Liu
Furong Huang
Abhinav Shrivastava
Jia-Bin Huang
LM&Ro
45
0
0
12 May 2025
Drive Fast, Learn Faster: On-Board RL for High Performance Autonomous Racing
Benedict Hildisch
Edoardo Ghignone
Nicolas Baumann
Cheng Hu
Andrea Carron
Michele Magno
39
0
0
12 May 2025
A Reinforcement Learning Framework for Application-Specific TCP Congestion-Control
Jinming Xing
Muhammad Shahzad
19
0
0
11 May 2025
TREND: Tri-teaching for Robust Preference-based Reinforcement Learning with Demonstrations
Shuaiyi Huang
Mara Levy
Anubhav Gupta
Daniel Ekpo
Ruijie Zheng
Abhinav Shrivastava
33
0
0
09 May 2025
DAPPER: Discriminability-Aware Policy-to-Policy Preference-Based Reinforcement Learning for Query-Efficient Robot Skill Acquisition
Yuki Kadokawa
Jonas Frey
Takahiro Miki
Takamitsu Matsubara
Marco Hutter
36
0
0
09 May 2025
Active Perception for Tactile Sensing: A Task-Agnostic Attention-Based Approach
Tim Schneider
Cristiana de Farias
Roberto Calandra
L. Chen
Jan Peters
223
0
0
09 May 2025
ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation
Jingzhong Lin
Yuanyuan Qi
Xinru Li
Wenxuan Huang
Xiangfeng Xu
Bangyan Li
Xuejiao Wang
Gaoqi He
33
0
0
08 May 2025
A critical assessment of reinforcement learning methods for microswimmer navigation in complex flows
Selim Mecanna
Aurore Loisy
Christophe Eloy
34
0
0
08 May 2025
Optimization of Infectious Disease Intervention Measures Based on Reinforcement Learning - Empirical analysis based on UK COVID-19 epidemic data
Baida Zhang
Yakai Chen
Huichun Li
Zhenghu Zu
34
0
0
07 May 2025
A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance
Axel Friedrich Wolter
Tobias Sutter
OffRL
37
0
0
07 May 2025
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation
Abdulaziz Almuzairee
Rohan Patil
Dwait Bhatt
Henrik I. Christensen
36
0
0
07 May 2025
Policy-labeled Preference Learning: Is Preference Enough for RLHF?
Taehyun Cho
Seokhun Ju
Seungyub Han
Dohyeong Kim
Kyungjae Lee
Jungwoo Lee
OffRL
29
0
0
06 May 2025
Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach
Yue Chen
Hui Kang
Jiahui Li
Geng Sun
Boxiong Wang
Jiacheng Wang
Cong Liang
Shuang Liang
Dusit Niyato
49
0
0
06 May 2025
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
Matthew Sgambati
Aleksandar Vakanski
Matthew Anderson
37
0
0
06 May 2025
Zero-shot Sim2Real Transfer for Magnet-Based Tactile Sensor on Insertion Tasks
Beining Han
Abhishek Joshi
Jia Deng
39
0
0
05 May 2025
Aerodynamic and structural airfoil shape optimisation via Transfer Learning-enhanced Deep Reinforcement Learning
David Ramos
Lucas Lacasa
E. Valero
G. Rubio
AI4CE
32
0
0
05 May 2025
A Goal-Oriented Reinforcement Learning-Based Path Planning Algorithm for Modular Self-Reconfigurable Satellites
Bofei Liu
Dong Ye
Zunhao Yao
Zhaowei Sun
33
0
0
04 May 2025
Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning
Jifeng Hu
Sili Huang
Zheng Yang
Shengchao Hu
Li Shen
Hechang Chen
Lichao Sun
Yi-Ju Chang
Dacheng Tao
OffRL
224
0
0
03 May 2025
Skill-based Safe Reinforcement Learning with Risk Planning
Hanping Zhang
Yuhong Guo
OffRL
OnRL
47
0
0
02 May 2025
A General Approach of Automated Environment Design for Learning the Optimal Power Flow
Thomas Wolgast
Astrid Nieße
AI4CE
26
0
0
01 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
Ming Yan
Fei Huang
Jingyi Wang
31
0
0
01 May 2025
Uncertainty-aware Latent Safety Filters for Avoiding Out-of-Distribution Failures
Junwon Seo
Kensuke Nakamura
Andrea V. Bajcsy
56
0
0
01 May 2025
Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations
Minseok Song
JeongHo Ha
Bonggyeong Park
Daehyung Park
207
0
0
01 May 2025
Fine-Tuning without Performance Degradation
Han Wang
Adam White
Martha White
OnRL
241
0
0
01 May 2025
Improvements of Dark Experience Replay and Reservoir Sampling towards Better Balance between Consolidation and Plasticity
Taisuke Kobayashi
CLL
43
0
0
29 Apr 2025
Dynamic Action Interpolation: A Universal Approach for Accelerating Reinforcement Learning with Expert Guidance
Wenjun Cao
52
0
0
26 Apr 2025
Learning from Less: SINDy Surrogates in RL
Aniket Dixit
Muhammad Ibrahim Khan
Faizan Ahmed
James Brusey
45
0
0
25 Apr 2025
Depth-Constrained ASV Navigation with Deep RL and Limited Sensing
Amirhossein Zhalehmehrabi
Daniele Meli
Francesco Dal Santo
Francesco Trotti
Alessandro Farinelli
29
0
0
25 Apr 2025
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
Zihan Wang
Kaidi Wang
Q. Wang
Pingyue Zhang
Linjie Li
...
Jiajun Wu
L. Fei-Fei
Lijuan Wang
Yejin Choi
Manling Li
92
7
0
24 Apr 2025
CaRL: Learning Scalable Planning Policies with Simple Rewards
Bernhard Jaeger
D. Dauner
Jens Beißwenger
Simon Gerstenecker
Kashyap Chitta
Andreas Geiger
60
1
0
24 Apr 2025
1
2
3
4
...
31
32
33
Next