ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1904.06387
  4. Cited By
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement
  Learning from Observations

Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations

12 April 2019
Daniel S. Brown
Wonjoon Goo
P. Nagarajan
S. Niekum
ArXivPDFHTML

Papers citing "Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations"

50 / 95 papers shown
Title
Reinforcement Learning from Multi-level and Episodic Human Feedback
Reinforcement Learning from Multi-level and Episodic Human Feedback
Muhammad Qasim Elahi
Somtochukwu Oguchienti
Maheed H. Ahmed
Mahsa Ghasemi
OffRL
55
0
0
20 Apr 2025
Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Uncertainty Comes for Free: Human-in-the-Loop Policies with Diffusion Models
Zhanpeng He
Yifeng Cao
M. Ciocarlie
80
0
0
26 Feb 2025
Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Inverse-RLignment: Large Language Model Alignment from Demonstrations through Inverse Reinforcement Learning
Hao Sun
M. Schaar
94
14
0
28 Jan 2025
Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation
N. Dennler
Stefanos Nikolaidis
Maja J. Matarić
227
0
0
03 Jan 2025
Incremental Learning for Robot Shared Autonomy
Incremental Learning for Robot Shared Autonomy
Yiran Tao
Guixiu Qiao
Dan Ding
Zackory Erickson
CLL
40
0
0
08 Oct 2024
Control-oriented Clustering of Visual Latent Representation
Control-oriented Clustering of Visual Latent Representation
Han Qi
Haocheng Yin
Heng Yang
SSL
63
2
0
07 Oct 2024
Online Control-Informed Learning
Online Control-Informed Learning
Zihao Liang
Tianyu Zhou
Zehui Lu
Shaoshuai Mou
38
1
0
04 Oct 2024
Robust Offline Imitation Learning from Diverse Auxiliary Data
Robust Offline Imitation Learning from Diverse Auxiliary Data
Udita Ghosh
Dripta S. Raychaudhuri
Jiachen Li
Konstantinos Karydis
Amit K. Roy-Chowdhury
OffRL
29
1
0
04 Oct 2024
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Bilevel reinforcement learning via the development of hyper-gradient without lower-level convexity
Yan Yang
Bin Gao
Ya-xiang Yuan
50
2
0
30 May 2024
A Unified Linear Programming Framework for Offline Reward Learning from
  Human Demonstrations and Feedback
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim
Jiawei Zhang
Asuman Ozdaglar
P. Parrilo
OffRL
46
1
0
20 May 2024
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning
Calarina Muslimani
Matthew E. Taylor
OffRL
46
2
0
30 Apr 2024
A Generalized Acquisition Function for Preference-based Reward Learning
A Generalized Acquisition Function for Preference-based Reward Learning
Evan Ellis
Gaurav R. Ghosal
Stuart J. Russell
Anca Dragan
Erdem Biyik
42
2
0
09 Mar 2024
Bayesian Constraint Inference from User Demonstrations Based on
  Margin-Respecting Preference Models
Bayesian Constraint Inference from User Demonstrations Based on Margin-Respecting Preference Models
Dimitris Papadimitriou
Daniel S. Brown
53
1
0
04 Mar 2024
A Model-Based Approach for Improving Reinforcement Learning Efficiency
  Leveraging Expert Observations
A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations
E. C. Ozcan
Vittorio Giammarino
James Queeney
I. Paschalidis
OffRL
44
0
0
29 Feb 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional
  Preference Alignment with Multi-Objective Rewards
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
40
72
0
28 Feb 2024
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
Huy Hoang
Tien Mai
Pradeep Varakantham
OffRL
47
2
0
20 Feb 2024
Iterative Data Smoothing: Mitigating Reward Overfitting and
  Overoptimization in RLHF
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu
Michael I. Jordan
Jiantao Jiao
36
25
0
29 Jan 2024
Crowd-PrefRL: Preference-Based Reward Learning from Crowds
Crowd-PrefRL: Preference-Based Reward Learning from Crowds
David Chhan
Ellen R. Novoseller
Vernon J. Lawhern
40
5
0
17 Jan 2024
Aligning Human Intent from Imperfect Demonstrations with
  Confidence-based Inverse soft-Q Learning
Aligning Human Intent from Imperfect Demonstrations with Confidence-based Inverse soft-Q Learning
Xizhou Bu
Wenjuan Li
Zhengxiong Liu
Zhiqiang Ma
Panfeng Huang
22
1
0
18 Dec 2023
A density estimation perspective on learning from pairwise human
  preferences
A density estimation perspective on learning from pairwise human preferences
Vincent Dumoulin
Daniel D. Johnson
Pablo Samuel Castro
Hugo Larochelle
Yann Dauphin
37
12
0
23 Nov 2023
Inverse Decision Modeling: Learning Interpretable Representations of
  Behavior
Inverse Decision Modeling: Learning Interpretable Representations of Behavior
Daniel Jarrett
Alihan Huyuk
M. Schaar
AI4CE
22
27
0
28 Oct 2023
Learning to Discern: Imitating Heterogeneous Human Demonstrations with
  Preference and Representation Learning
Learning to Discern: Imitating Heterogeneous Human Demonstrations with Preference and Representation Learning
Sachit Kuhar
Shuo Cheng
Shivang Chopra
Matthew Bronars
Danfei Xu
55
9
0
22 Oct 2023
Learning Reward for Physical Skills using Large Language Model
Learning Reward for Physical Skills using Large Language Model
Yuwei Zeng
Yiqing Xu
36
6
0
21 Oct 2023
Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement
  Learning with Sub-optimal Demonstrations
Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations
Lu Li
Yuxin Pan
Ruobing Chen
Jie Liu
Zilin Wang
Yu Liu
Zhiheng Li
50
0
0
13 Oct 2023
Reinforcement Learning in the Era of LLMs: What is Essential? What is
  needed? An RL Perspective on RLHF, Prompting, and Beyond
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond
Hao Sun
OffRL
34
21
0
09 Oct 2023
Rating-based Reinforcement Learning
Rating-based Reinforcement Learning
Devin White
Mingkang Wu
Ellen R. Novoseller
Vernon J. Lawhern
Nicholas R. Waytowich
Yongcan Cao
ALM
19
6
0
30 Jul 2023
On the Expressivity of Multidimensional Markov Reward
On the Expressivity of Multidimensional Markov Reward
Shuwa Miura
24
4
0
22 Jul 2023
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Preference-grounded Token-level Guidance for Language Model Fine-tuning
Shentao Yang
Shujian Zhang
Congying Xia
Yihao Feng
Caiming Xiong
Mi Zhou
29
23
0
01 Jun 2023
Programmatic Imitation Learning from Unlabeled and Noisy Demonstrations
Programmatic Imitation Learning from Unlabeled and Noisy Demonstrations
Jimmy Xin
Linus Zheng
Kia Rahmani
Jiayi Wei
Jarrett Holtz
Işıl Dillig
Joydeep Biswas
30
1
0
02 Mar 2023
MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts
MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts
Xiatao Sun
Shuo Yang
Mingyan Zhou
Kunpeng Liu
Rahul Mangharam
OffRL
15
13
0
01 Mar 2023
Active Reward Learning from Online Preferences
Active Reward Learning from Online Preferences
Vivek Myers
Erdem Biyik
Dorsa Sadigh
OffRL
37
12
0
27 Feb 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
Yunke Wang
Bo Du
Chang Xu
38
8
0
13 Feb 2023
Theoretical Analysis of Offline Imitation With Supplementary Dataset
Theoretical Analysis of Offline Imitation With Supplementary Dataset
Ziniu Li
Tian Xu
Y. Yu
Zhixun Luo
OffRL
38
2
0
27 Jan 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or
  $K$-wise Comparisons
Principled Reinforcement Learning with Human Feedback from Pairwise or KKK-wise Comparisons
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
42
184
0
26 Jan 2023
On The Fragility of Learned Reward Functions
On The Fragility of Learned Reward Functions
Lev McKinney
Yawen Duan
David M. Krueger
Adam Gleave
33
20
0
09 Jan 2023
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Benchmarks and Algorithms for Offline Preference-Based Reward Learning
Daniel Shin
Anca Dragan
Daniel S. Brown
OffRL
17
53
0
03 Jan 2023
Genetic Imitation Learning by Reward Extrapolation
Genetic Imitation Learning by Reward Extrapolation
Boyuan Zheng
Jianlong Zhou
Fang Chen
19
0
0
03 Jan 2023
Explaining Imitation Learning through Frames
Explaining Imitation Learning through Frames
Boyuan Zheng
Jianlong Zhou
Chun-Hao Liu
Yiqiao Li
Fang Chen
14
0
0
03 Jan 2023
SIRL: Similarity-based Implicit Representation Learning
SIRL: Similarity-based Implicit Representation Learning
Andreea Bobu
Yi Liu
Rohin Shah
Daniel S. Brown
Anca Dragan
SSL
DRL
40
17
0
02 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from
  Text Edits
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
101
35
0
01 Jan 2023
Few-Shot Preference Learning for Human-in-the-Loop RL
Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna
Dorsa Sadigh
OffRL
32
92
0
06 Dec 2022
Reinforcement learning with Demonstrations from Mismatched Task under
  Sparse Reward
Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward
Yanjiang Guo
Jingyue Gao
Zheng Wu
Chengming Shi
Jianyu Chen
OffRL
26
4
0
03 Dec 2022
Embedding Synthetic Off-Policy Experience for Autonomous Driving via
  Zero-Shot Curricula
Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula
Eli Bronstein
S. Srinivasan
Supratik Paul
Aman Sinha
Matthew O'Kelly
Payam Nikdel
Shimon Whiteson
OffRL
8
18
0
02 Dec 2022
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking
David Zhang
Micah Carroll
Andreea Bobu
Anca Dragan
29
4
0
30 Nov 2022
Understanding Acoustic Patterns of Human Teachers Demonstrating
  Manipulation Tasks to Robots
Understanding Acoustic Patterns of Human Teachers Demonstrating Manipulation Tasks to Robots
Akanksha Saran
K. Desai
M. L. Chang
Rudolf Lioutikov
A. Thomaz
S. Niekum
25
3
0
01 Nov 2022
D-Shape: Demonstration-Shaped Reinforcement Learning via Goal
  Conditioning
D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning
Caroline Wang
Garrett A. Warnell
Peter Stone
45
3
0
26 Oct 2022
Robust Offline Reinforcement Learning with Gradient Penalty and
  Constraint Relaxation
Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation
Chengqian Gao
Kelvin Xu
Liu Liu
Deheng Ye
P. Zhao
Zhiqiang Xu
OffRL
45
2
0
19 Oct 2022
Extraneousness-Aware Imitation Learning
Extraneousness-Aware Imitation Learning
Rachel Zheng
Kaizhe Hu
Zhecheng Yuan
Boyuan Chen
Huazhe Xu
SSL
33
3
0
04 Oct 2022
Bayesian Q-learning With Imperfect Expert Demonstrations
Bayesian Q-learning With Imperfect Expert Demonstrations
Fengdi Che
Xiru Zhu
Doina Precup
David Meger
Gregory Dudek
19
2
0
01 Oct 2022
Calculus on MDPs: Potential Shaping as a Gradient
Calculus on MDPs: Potential Shaping as a Gradient
Erik Jenner
H. V. Hoof
Adam Gleave
22
4
0
20 Aug 2022
12
Next