2308.00031
v4 (latest)
Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges
31 July 2023
Giorgio Franceschelli
Mirco Musolesi
AI4CE
Papers citing
"Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges"
50 / 65 papers shown
Title
Neural Collage Transfer: Artistic Reconstruction via Material Manipulation
Ganghun Lee
Minji Kim
Yunsu Lee
M. Lee
Byoung-Tak Zhang
DiffM
43
1
0
03 Nov 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
149
13
0
28 Aug 2023
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
Ashutosh Baheti
Ximing Lu
Faeze Brahman
Ronan Le Bras
Maarten Sap
Mark O. Riedl
70
9
0
24 May 2023
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
Ameet Deshpande
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
LM&MA
LLMAG
67
365
0
11 Apr 2023
On the Creativity of Large Language Models
Giorgio Franceschelli
Mirco Musolesi
168
56
0
27 Mar 2023
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
76
58
0
31 Jan 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
199
1,634
0
15 Dec 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Jiacheng Liu
Skyler Hallinan
Ximing Lu
Pengfei He
Sean Welleck
Hannaneh Hajishirzi
Yejin Choi
RALM
83
60
0
06 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
95
247
0
03 Oct 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
142
216
0
26 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM
DiffM
404
6,866
0
13 Apr 2022
Generative Cooperative Networks for Natural Language Generation
Sylvain Lamprier
Thomas Scialom
Antoine Chaffin
Vincent Claveau
Ewa Kijak
Jacopo Staiano
Benjamin Piwowarski
GAN
90
13
0
28 Jan 2022
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
137
1,595
0
20 Jan 2022
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
165
303
0
22 Sep 2021
Design Guidelines for Prompt Engineering Text-to-Image Generative Models
Vivian Liu
Lydia B. Chilton
65
492
0
14 Sep 2021
Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation
Samuel Kiegeland
Julia Kreutzer
AAML
70
46
0
16 Jun 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
415
4,953
0
24 Feb 2021
Curiosity in exploring chemical space: Intrinsic rewards for deep molecular reinforcement learning
Luca Thiede
Mario Krenn
AkshatKumar Nigam
Alán Aspuru-Guzik
50
31
0
17 Dec 2020
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings
Jaskirat Singh
Liang Zheng
55
20
0
25 Nov 2020
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
77
95
0
12 Oct 2020
ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
GAN
SyDa
52
18
0
08 Jun 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
99
597
0
15 Feb 2020
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
Nan Jiang
Sheng Jin
Z. Duan
Changshui Zhang
OffRL
86
50
0
08 Feb 2020
Graph Constrained Reinforcement Learning for Natural Language Action Spaces
Prithviraj Ammanabrolu
Matthew J. Hausknecht
AI4CE
LLMAG
69
129
0
23 Jan 2020
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
124
1,356
0
03 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kai Zhang
Zhuoran Yang
Tamer Basar
196
1,218
0
24 Nov 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
466
1,734
0
18 Sep 2019
A survey on intrinsic motivation in reinforcement learning
A. Aubret
L. Matignon
S. Hassas
AI4CE
85
144
0
19 Aug 2019
Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation
Yang Gao
Christian M. Meyer
Mohsen Mesgar
Iryna Gurevych
83
23
0
30 Jul 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend
110
108
0
03 Jul 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
326
5,845
0
21 Apr 2019
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
136
2,445
0
13 Dec 2018
Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning
Ziming Li
Julia Kiseleva
Maarten de Rijke
50
51
0
09 Dec 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
98
420
0
19 Nov 2018
Controllable Neural Story Plot Generation via Reward Shaping
Pradyumna Tambwekar
Murtaza Dhuliawala
Lara J. Martin
Animesh Mehta
Brent Harrison
Mark O. Riedl
77
88
0
27 Sep 2018
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
Jiaxuan You
Bowen Liu
Rex Ying
Vijay S. Pande
J. Leskovec
GNN
293
902
0
07 Jun 2018
Toward Diverse Text Generation with Inverse Reinforcement Learning
Zhan Shi
Xinchi Chen
Xipeng Qiu
Xuanjing Huang
53
104
0
30 Apr 2018
A Call for Clarity in Reporting BLEU Scores
Matt Post
160
2,994
0
23 Apr 2018
Multi-Reward Reinforced Summarization with Saliency and Entailment
Ramakanth Pasunuru
Joey Tianyi Zhou
56
201
0
17 Apr 2018
MaskGAN: Better Text Generation via Filling in the ______
W. Fedus
Ian Goodfellow
Andrew M. Dai
82
470
0
23 Jan 2018
Deep Reinforcement Learning for De-Novo Drug Design
Mariya Popova
Olexandr Isayev
Alexander Tropsha
88
1,031
0
29 Nov 2017
Long Text Generation via Adversarial Training with Leaked Information
Jiaxian Guo
Sidi Lu
Han Cai
Weinan Zhang
Yong Yu
Jun Wang
GAN
61
498
0
24 Sep 2017
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
65
138
0
24 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
499
19,065
0
20 Jul 2017
Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
G. L. Guimaraes
Benjamín Sánchez-Lengeling
Carlos Outeiral
Pedro Luis Cunha Farias
Alán Aspuru-Guzik
GAN
82
525
0
30 May 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM
SSL
108
2,439
0
15 May 2017
A Deep Reinforced Model for Abstractive Summarization
Romain Paulus
Caiming Xiong
R. Socher
AI4TS
199
1,558
0
11 May 2017
Molecular De Novo Design through Deep Reinforcement Learning
Marcus Olivecrona
T. Blaschke
Ola Engkvist
Hongming Chen
BDL
126
1,016
0
25 Apr 2017
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng
Xiujun Li
Lihong Li
Jianfeng Gao
Asli Celikyilmaz
Sungjin Lee
Kam-Fai Wong
BDL
62
190
0
10 Apr 2017
Improved Training of Wasserstein GANs
Ishaan Gulrajani
Faruk Ahmed
Martín Arjovsky
Vincent Dumoulin
Aaron Courville
GAN
207
9,548
0
31 Mar 2017