Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges

31 July 2023
Giorgio Franceschelli
Mirco Musolesi
    AI4CE

Papers citing "Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges"

50 / 65 papers shown
Title
Neural Collage Transfer: Artistic Reconstruction via Material Manipulation
Ganghun Lee
Minji Kim
Yunsu Lee
M. Lee
Byoung-Tak Zhang
DiffM
43
1
0
03 Nov 2023
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
149
13
0
28 Aug 2023
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
Ashutosh Baheti
Ximing Lu
Faeze Brahman
Ronan Le Bras
Maarten Sap
Mark O. Riedl
70
9
0
24 May 2023
Toxicity in ChatGPT: Analyzing Persona-assigned Language Models
Ameet Deshpande
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
LM&MA LLMAG
67
365
0
11 Apr 2023
On the Creativity of Large Language Models
Giorgio Franceschelli
Mirco Musolesi
168
56
0
27 Mar 2023
Execution-based Code Generation using Deep Reinforcement Learning
Parshin Shojaee
Aneesh Jain
Sindhu Tipirneni
Chandan K. Reddy
76
58
0
31 Jan 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa MoMe
199
1,634
0
15 Dec 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Jiacheng Liu
Skyler Hallinan
Ximing Lu
Pengfei He
Sean Welleck
Hannaneh Hajishirzi
Yejin Choi
RALM
83
60
0
06 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
95
247
0
03 Oct 2022
Quark: Controllable Text Generation with Reinforced Unlearning
Ximing Lu
Sean Welleck
Jack Hessel
Liwei Jiang
Lianhui Qin
Peter West
Prithviraj Ammanabrolu
Yejin Choi
MU
142
216
0
26 May 2022
Hierarchical Text-Conditional Image Generation with CLIP Latents
Aditya A. Ramesh
Prafulla Dhariwal
Alex Nichol
Casey Chu
Mark Chen
VLM DiffM
404
6,866
0
13 Apr 2022
Generative Cooperative Networks for Natural Language Generation
Sylvain Lamprier
Thomas Scialom
Antoine Chaffin
Vincent Claveau
Ewa Kijak
Jacopo Staiano
Benjamin Piwowarski
GAN
90
13
0
28 Jan 2022
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
137
1,595
0
20 Jan 2022
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nisan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
165
303
0
22 Sep 2021
Design Guidelines for Prompt Engineering Text-to-Image Generative Models
Vivian Liu
Lydia B. Chilton
65
492
0
14 Sep 2021
Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation
Samuel Kiegeland
Julia Kreutzer
AAML
70
46
0
16 Jun 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
415
4,953
0
24 Feb 2021
Curiosity in exploring chemical space: Intrinsic rewards for deep molecular reinforcement learning
Luca Thiede
Mario Krenn
AkshatKumar Nigam
Alán Aspuru-Guzik
50
31
0
17 Dec 2020
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings
Jaskirat Singh
Liang Zheng
55
20
0
25 Nov 2020
Human-centric Dialog Training via Offline Reinforcement Learning
Natasha Jaques
J. Shen
Asma Ghandeharioun
Craig Ferguson
Àgata Lapedriza
Noah J. Jones
S. Gu
Rosalind W. Picard
OffRL
77
95
0
12 Oct 2020
ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
GAN SyDa
52
18
0
08 Jun 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
99
597
0
15 Feb 2020
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
Nan Jiang
Sheng Jin
Z. Duan
Changshui Zhang
OffRL
86
50
0
08 Feb 2020
Graph Constrained Reinforcement Learning for Natural Language Action Spaces
Prithviraj Ammanabrolu
Matthew J. Hausknecht
AI4CE LLMAG
69
129
0
23 Jan 2020
Dream to Control: Learning Behaviors by Latent Imagination
Danijar Hafner
Timothy Lillicrap
Jimmy Ba
Mohammad Norouzi
VLM
124
1,356
0
03 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
Kai Zhang
Zhuoran Yang
Tamer Basar
196
1,218
0
24 Nov 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
466
1,734
0
18 Sep 2019
A survey on intrinsic motivation in reinforcement learning
A. Aubret
L. Matignon
S. Hassas
AI4CE
85
144
0
19 Aug 2019
Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation
Yang Gao
Christian M. Meyer
Mohsen Mesgar
Iryna Gurevych
83
23
0
30 Jul 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend
110
108
0
03 Jul 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
326
5,845
0
21 Apr 2019
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
136
2,445
0
13 Dec 2018
Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning
Ziming Li
Julia Kiseleva
Maarten de Rijke
50
51
0
09 Dec 2018
Scalable agent alignment via reward modeling: a research direction
Jan Leike
David M. Krueger
Tom Everitt
Miljan Martic
Vishal Maini
Shane Legg
98
420
0
19 Nov 2018
Controllable Neural Story Plot Generation via Reward Shaping
Pradyumna Tambwekar
Murtaza Dhuliawala
Lara J. Martin
Animesh Mehta
Brent Harrison
Mark O. Riedl
77
88
0
27 Sep 2018
Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation
Jiaxuan You
Bowen Liu
Rex Ying
Vijay S. Pande
J. Leskovec
GNN
293
902
0
07 Jun 2018
Toward Diverse Text Generation with Inverse Reinforcement Learning
Zhan Shi
Xinchi Chen
Xipeng Qiu
Xuanjing Huang
53
104
0
30 Apr 2018
A Call for Clarity in Reporting BLEU Scores
Matt Post
160
2,994
0
23 Apr 2018
Multi-Reward Reinforced Summarization with Saliency and Entailment
Ramakanth Pasunuru
Mohit Bansal
56
201
0
17 Apr 2018
MaskGAN: Better Text Generation via Filling in the______
W. Fedus
Ian Goodfellow
Andrew M. Dai
82
470
0
23 Jan 2018
Deep Reinforcement Learning for De-Novo Drug Design
Mariya Popova
Olexandr Isayev
Alexander Tropsha
88
1,031
0
29 Nov 2017
Long Text Generation via Adversarial Training with Leaked Information
Jiaxian Guo
Sidi Lu
Han Cai
Weinan Zhang
Yong Yu
Jun Wang
GAN
61
498
0
24 Sep 2017
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen
Hal Daumé
Jordan L. Boyd-Graber
65
138
0
24 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
499
19,065
0
20 Jul 2017
Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
G. L. Guimaraes
Benjamín Sánchez-Lengeling
Carlos Outeiral
Pedro Luis Cunha Farias
Alán Aspuru-Guzik
GAN
82
525
0
30 May 2017
Curiosity-driven Exploration by Self-supervised Prediction
Deepak Pathak
Pulkit Agrawal
Alexei A. Efros
Trevor Darrell
LRM SSL
108
2,439
0
15 May 2017
A Deep Reinforced Model for Abstractive Summarization
Romain Paulus
Caiming Xiong
R. Socher
AI4TS
199
1,558
0
11 May 2017
Molecular De Novo Design through Deep Reinforcement Learning
Marcus Olivecrona
T. Blaschke
Ola Engkvist
Hongming Chen
BDL
126
1,016
0
25 Apr 2017
Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning
Baolin Peng
Xiujun Li
Lihong Li
Jianfeng Gao
Asli Celikyilmaz
Sungjin Lee
Kam-Fai Wong
BDL
62
190
0
10 Apr 2017
Improved Training of Wasserstein GANs
Ishaan Gulrajani
Faruk Ahmed
Martín Arjovsky
Vincent Dumoulin
Aaron Courville
GAN
207
9,548
0
31 Mar 2017