1806.01946
Cited By
Learning to Understand Goal Specifications by Modelling Reward
5 June 2018
Dzmitry Bahdanau
Felix Hill
Jan Leike
Edward Hughes
Seyedarian Hosseini
Pushmeet Kohli
Edward Grefenstette
Papers citing
"Learning to Understand Goal Specifications by Modelling Reward"
38 / 38 papers shown
Title
IPCGRL: Language-Instructed Reinforcement Learning for Procedural Level Generation
In-Chang Baek
Sung-Hyun Kim
Seo-Young Lee
Dong-Hyeun Kim
Kyung-Joong Kim
56
0
0
16 Mar 2025
Few-Shot Task Learning through Inverse Generative Modeling
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
51
1
0
07 Nov 2024
Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning
Beyazit Yalcinkaya
Niklas Lauffer
Marcell Vazquez-Chanlatte
S. Seshia
AI4CE
52
5
0
31 Oct 2024
LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
Utsav Singh
Pramit Bhattacharyya
Vinay P. Namboodiri
LM&Ro
47
1
0
09 Jun 2024
Improve the efficiency of deep reinforcement learning through semantic exploration guided by natural language
Zhourui Guo
Meng Yao
Yang Yu
Qiyue Yin
OnRL
28
1
0
21 Sep 2023
Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation
Bokui (William) Shen
Ge Yang
Alan Yu
J. Wong
L. Kaelbling
Phillip Isola
VLM
29
104
0
27 Jul 2023
Designing Fiduciary Artificial Intelligence
Sebastian Benthall
David Shekman
51
4
0
27 Jul 2023
Reward Collapse in Aligning Large Language Models
Ziang Song
Tianle Cai
Jason D. Lee
Weijie J. Su
ALM
26
22
0
28 May 2023
Conceptual Reinforcement Learning for Language-Conditioned Tasks
Shaohui Peng
Xingui Hu
Rui Zhang
Jiaming Guo
Qi Yi
Rui Chen
Zidong Du
Ling Li
Qi Guo
Yunji Chen
OffRL
35
8
0
09 Mar 2023
PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav
Ram Ramrakhya
Dhruv Batra
Erik Wijmans
Abhishek Das
OffRL
20
53
0
18 Jan 2023
A Computational Interface to Translate Strategic Intent from Unstructured Language in a Low-Data Setting
Pradyumna Tambwekar
Lakshita Dodeja
Nathan Vaska
Wei Xu
Matthew C. Gombolay
33
0
0
17 Aug 2022
Leveraging Language for Accelerated Learning of Tool Manipulation
Allen Z. Ren
Bharat Govil
Tsung-Yen Yang
Karthik Narasimhan
Anirudha Majumdar
LM&Ro
22
37
0
27 Jun 2022
How to talk so AI will learn: Instructions, descriptions, and autonomy
T. Sumers
Robert D. Hawkins
Mark K. Ho
Thomas L. Griffiths
Dylan Hadfield-Menell
LM&Ro
32
20
0
16 Jun 2022
Inferring Rewards from Language in Context
Jessy Lin
Daniel Fried
Dan Klein
Anca Dragan
LM&Ro
24
54
0
05 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,953
0
04 Mar 2022
Learning Invariable Semantical Representation from Language for Extensible Policy Generalization
Yihan Li
Jinsheng Ren
Tianrun Xu
Tianren Zhang
Haichuan Gao
Feng Chen
16
1
0
26 Jan 2022
Safe Deep RL in 3D Environments using Human Feedback
Matthew Rahtz
Vikrant Varma
Ramana Kumar
Zachary Kenton
Shane Legg
Jan Leike
29
4
0
20 Jan 2022
Learning to Guide and to Be Guided in the Architect-Builder Problem
Paul Barde
Tristan Karch
Derek Nowrouzezahrai
Clément Moulin-Frier
C. Pal
Pierre-Yves Oudeyer
41
4
0
14 Dec 2021
LILA: Language-Informed Latent Actions
Siddharth Karamcheti
Megha Srivastava
Percy Liang
Dorsa Sadigh
LM&Ro
27
31
0
05 Nov 2021
Feudal Reinforcement Learning by Reading Manuals
Kai Wang
Zhonghao Wang
Mo Yu
Humphrey Shi
OffRL
30
0
0
13 Oct 2021
Interactive Hierarchical Guidance using Language
Bharat Prakash
Nicholas R. Waytowich
Tim Oates
T. Mohsenin
LM&Ro
14
9
0
09 Oct 2021
Generalization in Text-based Games via Hierarchical Reinforcement Learning
Yunqiu Xu
Meng Fang
Ling Chen
Yali Du
Chengqi Zhang
AI4CE
40
20
0
21 Sep 2021
Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning
Ning Wei
Jiahua Liang
Di Xie
Shiliang Pu
17
0
0
06 Sep 2021
Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation
Suraj Nair
E. Mitchell
Kevin Chen
Brian Ichter
Silvio Savarese
Chelsea Finn
LM&Ro
OffRL
37
154
0
02 Sep 2021
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
134
0
12 Jul 2021
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
Grgur Kovač
Rémy Portelas
Katja Hofmann
Pierre-Yves Oudeyer
ALM
27
6
0
02 Jul 2021
Grounding Spatio-Temporal Language with Transformers
Tristan Karch
Laetitia Teodorescu
Katja Hofmann
Clément Moulin-Frier
Pierre-Yves Oudeyer
LM&Ro
16
11
0
16 Jun 2021
Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation
Julia Ive
A. Li
Yishu Miao
Ozan Caglayan
Pranava Madhyastha
Lucia Specia
26
10
0
22 Feb 2021
Open Problems in Cooperative AI
Allan Dafoe
Edward Hughes
Yoram Bachrach
Tantum Collins
Kevin R. McKee
Joel Z Leibo
Kate Larson
T. Graepel
26
199
0
15 Dec 2020
Connecting Context-specific Adaptation in Humans to Meta-learning
Rachit Dubey
Erin Grant
Michael Luo
Karthik Narasimhan
Thomas L. Griffiths
24
4
0
27 Nov 2020
Learning Rewards from Linguistic Feedback
T. Sumers
Mark K. Ho
Robert D. Hawkins
Karthik Narasimhan
Thomas L. Griffiths
21
51
0
30 Sep 2020
Learning with AMIGo: Adversarially Motivated Intrinsic Goals
Andres Campero
Roberta Raileanu
Heinrich Küttler
J. Tenenbaum
Tim Rocktäschel
Edward Grefenstette
32
125
0
22 Jun 2020
Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
Felix Hill
Soňa Mokrá
Nathaniel Wong
Tim Harley
LM&Ro
19
81
0
19 May 2020
Using Natural Language for Reward Shaping in Reinforcement Learning
Prasoon Goyal
S. Niekum
Raymond J. Mooney
LM&Ro
38
175
0
05 Mar 2019
Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning
Aishwarya Agrawal
Mateusz Malinowski
Felix Hill
S. M. Ali Eslami
Oriol Vinyals
Tejas D. Kulkarni
21
4
0
03 Dec 2018
Unsupervised Control Through Non-Parametric Discriminative Rewards
David Warde-Farley
T. Wiele
Tejas D. Kulkarni
Catalin Ionescu
S. Hansen
Volodymyr Mnih
DRL
OffRL
SSL
33
172
0
28 Nov 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
28
395
0
19 Nov 2018
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Maxime Chevalier-Boisvert
Dzmitry Bahdanau
Salem Lahlou
Lucas Willems
Chitwan Saharia
Thien Huu Nguyen
Yoshua Bengio
ELM
16
230
0
18 Oct 2018