1912.02875
Cited By
Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions
5 December 2019
J. Schmidhuber
Papers citing "Reinforcement Learning Upside Down: Don't Predict Rewards -- Just Map Them to Actions"
23 / 23 papers shown
BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL
Yu-Heng Hung
Kai-Jie Lin
Yu-Heng Lin
Chien-Yi Wang
Cheng Sun
Ping-Chun Hsieh
60
1
0
28 May 2025
Temporal Logic Specification-Conditioned Decision Transformer for Offline Safe Reinforcement Learning
Zijian Guo
Weichao Zhou
Wenchao Li
OffRL
138
2
0
28 Jan 2025
Upside Down Reinforcement Learning with Policy Generators
Jacopo Di Ventura
Dylan R. Ashley
Vincent Herrmann
Francesco Faccio
Jürgen Schmidhuber
69
0
0
27 Jan 2025
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Poppel
Johannes Brandstetter
Günter Klambauer
Razvan Pascanu
Sepp Hochreiter
196
7
0
29 Oct 2024
Reinforcement learning
Florentin Wörgötter
82
2,533
0
16 May 2024
Return-Aligned Decision Transformer
Tsunehiko Tanaka
Kenshi Abe
Kaito Ariu
Tetsuro Morimura
Edgar Simo-Serra
OffRL
104
1
0
06 Feb 2024
An Invitation to Deep Reinforcement Learning
Bernhard Jaeger
Andreas Geiger
OffRL
OOD
122
5
0
13 Dec 2023
A Tractable Inference Perspective of Offline RL
Xuejie Liu
Hoang Trung-Dung
Guy Van den Broeck
Yitao Liang
OffRL
92
1
0
31 Oct 2023
Training Agents using Upside-Down Reinforcement Learning
R. Srivastava
Pranav Shyam
Filipe Wall Mutz
Wojciech Jaśkowski
Jürgen Schmidhuber
OffRL
68
126
0
05 Dec 2019
RUDDER: Return Decomposition for Delayed Rewards
Jose A. Arjona-Medina
Michael Gillhofer
Michael Widrich
Thomas Unterthiner
Johannes Brandstetter
Sepp Hochreiter
67
218
0
20 Jun 2018
One Big Net For Everything
Jürgen Schmidhuber
CLL
58
34
0
24 Feb 2018
Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control
Glen Berseth
Kevin Xie
Paul Cernek
M. van de Panne
69
56
0
13 Feb 2018
Hindsight policy gradients
Paulo E. Rauber
Avinash Ummadisingu
Filipe Wall Mutz
J. Schmidhuber
59
68
0
16 Nov 2017
Hindsight Experience Replay
Marcin Andrychowicz
Dwight Crow
Alex Ray
Jonas Schneider
Rachel Fong
Peter Welinder
Bob McGrew
Joshua Tobin
Pieter Abbeel
Wojciech Zaremba
OffRL
248
2,328
0
05 Jul 2017
Time-Contrastive Networks: Self-Supervised Learning from Video
P. Sermanet
Corey Lynch
Yevgen Chebotar
Jasmine Hsu
Eric Jang
S. Schaal
Sergey Levine
SSL
101
826
0
23 Apr 2017
One-Shot Imitation Learning
Yan Duan
Marcin Andrychowicz
Bradly C. Stadie
Jonathan Ho
Jonas Schneider
Ilya Sutskever
Pieter Abbeel
Wojciech Zaremba
OffRL
79
688
0
21 Mar 2017
Learning to Act by Predicting the Future
Alexey Dosovitskiy
V. Koltun
145
281
0
06 Nov 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhiwen Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
897
6,790
0
26 Sep 2016
On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models
Jürgen Schmidhuber
64
104
0
30 Nov 2015
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
434
20,568
0
10 Sep 2014
Deep Learning in Neural Networks: An Overview
Jürgen Schmidhuber
HAI
246
16,361
0
30 Apr 2014
First Experiments with PowerPlay
R. Srivastava
Bas R. Steunebrink
Jürgen Schmidhuber
90
51
0
31 Oct 2012
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
Jürgen Schmidhuber
97
149
0
22 Dec 2011