ResearchTrend.AI
arXiv: 1801.01290

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

4 January 2018
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine

Papers citing "Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor"

50 / 4,130 papers shown
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
Katherine Metcalf
Miguel Sarabia
Natalie Mackraz
B. Theobald
78
6
0
28 Feb 2024
Imitation-regularized Optimal Transport on Networks: Provable Robustness and Application to Logistics Planning
Koshi Oishi
Yota Hashizume
Tomohiko Jimbo
Hirotaka Kaji
Kenji Kashima
OOD
95
2
0
28 Feb 2024
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Michael T. Matthews
Michael Beukman
Benjamin Ellis
Mikayel Samvelyan
Matthew Jackson
Samuel Coward
Jakob Foerster
OffRL
114
31
0
26 Feb 2024
Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)
Qifeng Li
Xiaosong Jia
Shaobo Wang
Junchi Yan
126
34
0
26 Feb 2024
Language-guided Skill Learning with Temporal Variational Inference
Haotian Fu
Pratyusha Sharma
Elias Stengel-Eskin
George Konidaris
Nicolas Le Roux
Marc-Alexandre Côté
Xingdi Yuan
113
8
0
26 Feb 2024
Harnessing the Synergy between Pushing, Grasping, and Throwing to Enhance Object Manipulation in Cluttered Scenarios
Hamidreza Kasaei
Mohammadreza Kasaei
81
1
0
25 Feb 2024
Discretionary Lane-Change Decision and Control via Parameterized Soft Actor-Critic for Hybrid Action Space
Yuan Lin
Xiao Liu
Zishun Zheng
60
5
0
24 Feb 2024
A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
Shuyu Yin
Qixuan Zhou
Fei Wen
Tao Luo
83
0
0
24 Feb 2024
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement
Ruiqi Zhang
Yuexiang Zhai
Andrea Zanette
111
0
0
24 Feb 2024
Fair Resource Allocation in Multi-Task Learning
Hao Ban
Kaiyi Ji
80
14
0
23 Feb 2024
Reinforcement Learning with Elastic Time Steps
Dong Wang
Giovanni Beltrame
102
2
0
22 Feb 2024
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization
Tianying Ji
Yongyuan Liang
Yan Zeng
Yu-Juan Luo
Guowei Xu
Jiawei Guo
Ruijie Zheng
Furong Huang
Gang Hua
Huazhe Xu
CML
125
12
0
22 Feb 2024
Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models
Jinyi Liu
Yifu Yuan
Jianye Hao
Fei Ni
Lingzhi Fu
Yibin Chen
Yan Zheng
LM&Ro
410
6
0
22 Feb 2024
BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
Catherine Weaver
Chen Tang
Ce Hao
Kenta Kawamoto
Masayoshi Tomizuka
Wei Zhan
OffRL
84
0
0
22 Feb 2024
Learning control strategy in soft robotics through a set of configuration spaces
Etienne Ménager
Christian Duriez
89
0
0
21 Feb 2024
The Edge-of-Reach Problem in Offline Model-Based Reinforcement Learning
Anya Sims
Cong Lu
Yee Whye Teh
OffRL
98
4
0
19 Feb 2024
In value-based deep reinforcement learning, a pruned network is a good network
J. Obando-Ceron
Rameswar Panda
Pablo Samuel Castro
OffRL
132
26
0
19 Feb 2024
Revisiting Data Augmentation in Deep Reinforcement Learning
Jianshu Hu
Yunpeng Jiang
Paul Weng
OffRL
94
6
0
19 Feb 2024
All Language Models Large and Small
Zhixun Chen
Yali Du
D. Mguni
57
0
0
19 Feb 2024
Multi Task Inverse Reinforcement Learning for Common Sense Reward
Neta Glazer
Aviv Navon
Aviv Shamsian
Ethan Fetaya
85
0
0
17 Feb 2024
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Xinyu Zhang
Wenjie Qiu
Yi-Chen Li
Lei Yuan
Chengxing Jia
Zongzhang Zhang
Yang Yu
OffRL
121
1
0
17 Feb 2024
Policy Learning for Off-Dynamics RL with Deficient Support
Linh Le Pham Van
Hung The Tran
Sunil R. Gupta
84
2
0
16 Feb 2024
Revisiting Experience Replayable Conditions
Taisuke Kobayashi
104
3
0
15 Feb 2024
Discrete Probabilistic Inference as Control in Multi-path Environments
T. Deleu
Padideh Nouri
Nikolay Malkin
Doina Precup
Yoshua Bengio
183
31
0
15 Feb 2024
Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts
Tobias Enders
James Harrison
Maximilian Schiffer
OOD
99
5
0
15 Feb 2024
Dataset Clustering for Improved Offline Policy Learning
Qiang Wang
Yixin Deng
Francisco Roldan Sanchez
Keru Wang
Kevin McGuinness
Noel E. O'Connor
Stephen J. Redmond
OffRL
89
2
0
14 Feb 2024
Entropy-regularized Point-based Value Iteration
Harrison Delecki
Marcell Vazquez-Chanlatte
Esen Yel
K. H. Wray
Tomer Arnon
Stefan J. Witwicki
Mykel J. Kochenderfer
OOD
92
0
0
14 Feb 2024
Single-Reset Divide & Conquer Imitation Learning
Alexandre Chenu
Olivier Serris
Olivier Sigaud
Nicolas Perrin-Gilbert
69
0
0
14 Feb 2024
Hybrid Inverse Reinforcement Learning
Juntao Ren
Gokul Swamy
Zhiwei Steven Wu
J. Andrew Bagnell
Sanjiban Choudhury
88
20
0
13 Feb 2024
Evaluation of a Smart Mobile Robotic System for Industrial Plant Inspection and Supervision
Georg K.J. Fischer
M. Bergau
D. A. Gómez-Rosal
Andreas Wachaja
Johannes Grater
...
Nikhil Gosala
Niklas Wetzel
Daniel Buscher
Abhinav Valada
Wolfram Burgard
58
3
0
12 Feb 2024
SPO: Sequential Monte Carlo Policy Optimisation
Clément Bonnet
Edan Toledo
Donal Byrne
Paul Duckworth
Alexandre Laterre
80
1
0
12 Feb 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Kaiwen Wang
Owen Oertell
Alekh Agarwal
Nathan Kallus
Wen Sun
OffRL
128
12
0
11 Feb 2024
Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks
Michael Y. Fatemi
Wesley A Suttle
Brian M Sadler
OffRL
57
4
0
09 Feb 2024
Hierarchical Transformers are Efficient Meta-Reinforcement Learners
Gresa Shala
André Biedenkapp
Josif Grabocka
OffRL
114
4
0
09 Feb 2024
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
Muning Wen
Junwei Liao
Cheng Deng
Jun Wang
Weinan Zhang
Ying Wen
92
3
0
09 Feb 2024
Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains
Feiyang Wu
Xavier Nal
Ye Zhao
Anqi Wu
Zhaoyuan Gu
84
0
0
09 Feb 2024
Scaling Artificial Intelligence for Digital Wargaming in Support of Decision-Making
Scotty Black
Christian J. Darken
32
2
0
08 Feb 2024
Limitations of Agents Simulated by Predictive Models
Raymond Douglas
Jacek Karwowski
Chan Bae
Andis Draguns
Victoria Krakovna
46
0
0
08 Feb 2024
DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning
Weikang Wan
Ziyu Wang
Yufei Wang
Zackory M. Erickson
David Held
143
4
0
08 Feb 2024
Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks
Peter Graf
Patrick Emami
61
2
0
07 Feb 2024
Do Transformer World Models Give Better Policy Gradients?
Michel Ma
Tianwei Ni
Clement Gehring
P. D'Oro
Pierre-Luc Bacon
86
4
0
07 Feb 2024
Analyzing Adversarial Inputs in Deep Reinforcement Learning
Davide Corsi
Guy Amir
Guy Katz
Alessandro Farinelli
AAML
65
7
0
07 Feb 2024
QGFN: Controllable Greediness with Action Values
Elaine Lau
Stephen Zhewen Lu
Ling Pan
Doina Precup
Emmanuel Bengio
178
14
0
07 Feb 2024
Learning Diverse Policies with Soft Self-Generated Guidance
Guojian Wang
Faguo Wu
Xiao Zhang
Jianxiang Liu
OffRL
65
4
0
07 Feb 2024
Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning
Ruoqing Zhang
Ziwei Luo
Jens Sjölund
Thomas B. Schön
Per Mattsson
113
13
0
06 Feb 2024
Reinforcement Learning from Bagged Reward
Yuting Tang
Xin-Qiang Cai
Yao-Xiang Ding
Qiyu Wu
Guoqing Liu
Masashi Sugiyama
OffRL
88
0
0
06 Feb 2024
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang
Zhanyi Sun
Jesse Zhang
Zhou Xian
Erdem Biyik
David Held
Zackory M. Erickson
VLM
124
59
0
06 Feb 2024
Transductive Reward Inference on Graph
B. Qu
Xiaofeng Cao
Qing Guo
Yi Chang
Ivor W. Tsang
Chengqi Zhang
OffRL
114
0
0
06 Feb 2024
A Multi-step Loss Function for Robust Learning of the Dynamics in Model-based Reinforcement Learning
Abdelhakim Benechehab
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
Balázs Kégl
NoLa
61
1
0
05 Feb 2024
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu
S. Zhan
Yixuan Wang
Yuhui Wang
Chung-Wei Lin
Chen Lv
Qi Zhu
Jürgen Schmidhuber
Chao Huang
OffRL
93
2
0
05 Feb 2024