
Learning Long-Term Reward Redistribution via Randomized Return Decomposition
Papers citing "Learning Long-Term Reward Redistribution via Randomized Return Decomposition"
34 / 34 papers shown
Title |
---|
Soft Actor-Critic Algorithms and Applications. Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, ..., Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine |