arXiv:2405.10369
Reinforcement learning


16 May 2024
Florentin Wörgötter

Papers citing "Reinforcement learning"

39 / 89 papers shown
Title
Minimum Risk Training for Neural Machine Translation
Shiqi Shen
Yong Cheng
Zhongjun He
W. He
Hua Wu
Maosong Sun
Yang Liu
90
469
0
08 Dec 2015
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Yinlam Chow
Mohammad Ghavamzadeh
Lucas Janson
Marco Pavone
55
510
0
05 Dec 2015
Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato
S. Chopra
Michael Auli
Wojciech Zaremba
70
1,611
0
20 Nov 2015
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
198
3,781
0
18 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
131
7,590
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
191
13,174
0
09 Sep 2015
Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
Daniel R. Jiang
Warren B. Powell
20
37
0
07 Sep 2015
Maximum Entropy Deep Inverse Reinforcement Learning
Markus Wulfmeier
Peter Ondruska
Ingmar Posner
OOD
60
404
0
17 Jul 2015
Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret
Haitham Bou-Ammar
Rasul Tutunov
Eric Eaton
OffRL
CLL
48
64
0
21 May 2015
End-to-End Training of Deep Visuomotor Policies
Sergey Levine
Chelsea Finn
Trevor Darrell
Pieter Abbeel
BDL
222
3,418
0
02 Apr 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
239
6,722
0
19 Feb 2015
Policy Gradient for Coherent Risk Measures
Aviv Tamar
Yinlam Chow
Mohammad Ghavamzadeh
Shie Mannor
38
117
0
13 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
815
149,474
0
22 Dec 2014
Sequence to Sequence Learning with Neural Networks
Ilya Sutskever
Oriol Vinyals
Quoc V. Le
AIMat
280
20,491
0
10 Sep 2014
Algorithms for CVaR Optimization in MDPs
Yinlam Chow
Mohammad Ghavamzadeh
60
198
0
12 Jun 2014
Optimizing the CVaR via Sampling
Aviv Tamar
Yonatan Glassner
Shie Mannor
48
186
0
15 Apr 2014
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
Prashanth L.A.
Mohammad Ghavamzadeh
37
69
0
25 Mar 2014
A Survey of Multi-Objective Sequential Decision-Making
D. Roijers
Peter Vamplew
Shimon Whiteson
Richard Dazeley
61
648
0
04 Feb 2014
Kalman Temporal Differences
Matthieu Geist
Olivier Pietquin
47
101
0
16 Jan 2014
Interactive Policy Learning through Confidence-Based Autonomy
Sonia Chernova
Manuela Veloso
124
279
0
15 Jan 2014
Black Box Variational Inference
Rajesh Ranganath
S. Gerrish
David M. Blei
DRL
BDL
68
1,157
0
31 Dec 2013
Risk-sensitive Reinforcement Learning
Yun Shen
Michael J. Tobia
T. Sommer
Klaus Obermayer
52
319
0
08 Nov 2013
PEGASUS: A Policy Search Method for Large MDPs and POMDPs
A. Ng
Michael I. Jordan
55
496
0
16 Jan 2013
Solving Factored MDPs with Continuous and Discrete Variables
Carlos Guestrin
Milos Hauskrecht
Branislav Kveton
67
76
0
11 Jul 2012
Policy Gradients with Variance Related Risk Criteria
Dotan Di Castro
Aviv Tamar
Shie Mannor
63
206
0
27 Jun 2012
Apprenticeship Learning using Inverse Reinforcement Learning and Gradient Methods
Gergely Neu
Csaba Szepesvári
47
244
0
20 Jun 2012
Exploring compact reinforcement-learning representations with linear regression
Thomas J. Walsh
I. Szita
Carlos Diuk
Michael L. Littman
OffRL
174
114
0
09 May 2012
PAC-Bayesian Inequalities for Martingales
Yevgeny Seldin
François Laviolette
Nicolò Cesa-Bianchi
John Shawe-Taylor
P. Auer
57
127
0
31 Oct 2011
Learning Symbolic Models of Stochastic Domains
L. Kaelbling
H. Pasula
L. S. Zettlemoyer
86
240
0
10 Oct 2011
Risk-Sensitive Reinforcement Learning Applied to Control under Constraints
Peter Geibel
F. Wysotzki
73
317
0
09 Sep 2011
Learning to Coordinate Efficiently: A Model-based Approach
Ronen I. Brafman
Moshe Tennenholtz
35
43
0
26 Jun 2011
Bayesian multitask inverse reinforcement learning
Christos Dimitrakakis
Constantin Rothkopf
BDL
53
106
0
18 Jun 2011
Efficient Solution Algorithms for Factored MDPs
Carlos Guestrin
D. Koller
Ronald E. Parr
Shobha Venkataraman
84
537
0
09 Jun 2011
Optimal Reinforcement Learning for Gaussian Systems
Philipp Hennig
61
19
0
04 Jun 2011
Experiments with Infinite-Horizon, Policy-Gradient Estimation
Jonathan Baxter
Peter L. Bartlett
Lex Weaver
53
171
0
03 Jun 2011
Infinite-Horizon Policy-Gradient Estimation
Jonathan Baxter
Peter L. Bartlett
70
808
0
03 Jun 2011
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
Craig Boutilier
T. Dean
S. Hanks
87
1,311
0
27 May 2011
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
267
2,935
0
28 Feb 2010
Learning from Logged Implicit Exploration Data
Alexander L. Strehl
John Langford
Sham Kakade
Lihong Li
OffRL
104
254
0
27 Feb 2010