Multi-Advisor Reinforcement Learning

3 April 2017
Romain Laroche
Mehdi Fatemi
Joshua Romoff
H. V. Seijen
    OffRL

Papers citing "Multi-Advisor Reinforcement Learning"

11 / 11 papers shown
Successor Features for Efficient Multisubject Controlled Text Generation
Mengyao Cao
Mehdi Fatemi
Jackie Chi Kit Cheung
Samira Shabanian
BDL
84
0
0
03 Nov 2023
Prioritized Soft Q-Decomposition for Lexicographic Reinforcement Learning
Finn Rietz
Erik Schaffernicht
Stefan Heinrich
J. A. Stork
71
1
0
03 Oct 2023
Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards
Silviu Pitis
70
6
0
30 Sep 2023
On the Value of Myopic Behavior in Policy Reuse
Kang Xu
Chenjia Bai
Shuang Qiu
Haoran He
Bin Zhao
Zhen Wang
Wei Li
Xuelong Li
89
1
0
28 May 2023
Value Function Decomposition for Iterative Design of Reinforcement Learning Agents
J. MacGlashan
Evan Archer
A. Devlic
Takuma Seno
Craig Sherstan
Peter R. Wurman
Peter Stone (Sony AI)
54
6
0
24 Jun 2022
Orchestrated Value Mapping for Reinforcement Learning
Mehdi Fatemi
Arash Tavakoli
53
8
0
14 Mar 2022
Reward-Adaptive Reinforcement Learning: Dynamic Policy Gradient Optimization for Bipedal Locomotion
Changxin Huang
Guangrun Wang
Zhibo Zhou
Ronghui Zhang
Liang Lin
74
20
0
05 Jul 2021
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms
Shangtong Zhang
Romain Laroche
H. V. Seijen
Shimon Whiteson
Rémi Tachet des Combes
120
15
0
02 Oct 2020
On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning
Nicholas Denis
OffRL
9
0
0
01 Jul 2019
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Francisco M. Garcia
Philip S. Thomas
102
41
0
03 Feb 2019
Danger-aware Adaptive Composition of DRL Agents for Self-navigation
Wei Zhang
Yunfeng Zhang
Ning Liu
33
9
0
11 Sep 2018