ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1909.01214
  4. Cited By
Better Rewards Yield Better Summaries: Learning to Summarise Without
  References

Better Rewards Yield Better Summaries: Learning to Summarise Without References

3 September 2019
F. Böhm
Yang Gao
Christian M. Meyer
Ori Shapira
Ido Dagan
Iryna Gurevych
ArXivPDFHTML

Papers citing "Better Rewards Yield Better Summaries: Learning to Summarise Without References"

28 / 28 papers shown
Title
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model
Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model
Qi Gou
Cam-Tu Nguyen
35
8
0
28 Mar 2024
A Critical Evaluation of AI Feedback for Aligning Large Language Models
A Critical Evaluation of AI Feedback for Aligning Large Language Models
Archit Sharma
Sedrick Scott Keh
Eric Mitchell
Chelsea Finn
Kushal Arora
Thomas Kollar
ALM
LLMAG
29
23
0
19 Feb 2024
Reinforcement Learning from Statistical Feedback: the Journey from AB
  Testing to ANT Testing
Reinforcement Learning from Statistical Feedback: the Journey from AB Testing to ANT Testing
Feiyang Han
Yimin Wei
Zhaofeng Liu
Yanxing Qi
40
1
0
24 Nov 2023
Improving Summarization with Human Edits
Improving Summarization with Human Edits
Zonghai Yao
Benjamin J Schloss
Sai P. Selvaraj
32
3
0
09 Oct 2023
Aligning Language Models with Human Preferences via a Bayesian Approach
Aligning Language Models with Human Preferences via a Bayesian Approach
Jiashuo Wang
Haozhao Wang
Shichao Sun
Wenjie Li
ALM
42
22
0
09 Oct 2023
Optimal Control of Nonlinear Systems with Unknown Dynamics
Optimal Control of Nonlinear Systems with Unknown Dynamics
Wenjian Hao
Paulo Heredia
Bowen Huang
42
1
0
24 May 2023
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and
  Scaling Laws
Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws
Kush S. Bhatia
Wenshuo Guo
Jacob Steinhardt
27
0
0
23 Feb 2023
Human-in-the-loop Abstractive Dialogue Summarization
Human-in-the-loop Abstractive Dialogue Summarization
Jiaao Chen
Mohan Dodda
Diyi Yang
28
10
0
19 Dec 2022
Evaluating Human-Language Model Interaction
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
60
100
0
19 Dec 2022
The CRINGE Loss: Learning what language not to model
The CRINGE Loss: Learning what language not to model
Leonard Adolphs
Tianyu Gao
Jing Xu
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
MU
31
35
0
10 Nov 2022
MACSum: Controllable Summarization with Mixed Attributes
MACSum: Controllable Summarization with Mixed Attributes
Yusen Zhang
Yang Liu
Ziyi Yang
Yuwei Fang
Yulong Chen
Dragomir R. Radev
Chenguang Zhu
Michael Zeng
Rui Zhang
37
15
0
09 Nov 2022
Universal Evasion Attacks on Summarization Scoring
Universal Evasion Attacks on Summarization Scoring
Wenchuan Mu
Kwan Hui Lim
AAML
38
1
0
25 Oct 2022
Towards Interpretable Summary Evaluation via Allocation of Contextual
  Embeddings to Reference Text Topics
Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics
Ben Schaper
Christopher Lohse
Marcell Streile
Andrea Giovannini
Richard Osuala
24
1
0
25 Oct 2022
Innovations in Neural Data-to-text Generation: A Survey
Innovations in Neural Data-to-text Generation: A Survey
Mandar Sharma
Ajay K. Gogineni
Naren Ramakrishnan
34
10
0
25 Jul 2022
Offline RL for Natural Language Generation with Implicit Language Q
  Learning
Offline RL for Natural Language Generation with Implicit Language Q Learning
Charles Burton Snell
Ilya Kostrikov
Yi Su
Mengjiao Yang
Sergey Levine
OffRL
144
103
0
05 Jun 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
375
12,081
0
04 Mar 2022
Reward Modeling for Mitigating Toxicity in Transformer-based Language
  Models
Reward Modeling for Mitigating Toxicity in Transformer-based Language Models
Farshid Faal
K. Schmitt
Jia Yuan Yu
13
24
0
19 Feb 2022
Recursively Summarizing Books with Human Feedback
Recursively Summarizing Books with Human Feedback
Jeff Wu
Long Ouyang
Daniel M. Ziegler
Nissan Stiennon
Ryan J. Lowe
Jan Leike
Paul Christiano
ALM
37
296
0
22 Sep 2021
Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Automatic Text Evaluation through the Lens of Wasserstein Barycenters
Pierre Colombo
Guillaume Staerman
Chloé Clavel
Pablo Piantanida
27
41
0
27 Aug 2021
CapWAP: Captioning with a Purpose
CapWAP: Captioning with a Purpose
Adam Fisch
Kenton Lee
Ming-Wei Chang
J. Clark
Regina Barzilay
8
11
0
09 Nov 2020
What Have We Achieved on Text Summarization?
What Have We Achieved on Text Summarization?
Dandan Huang
Leyang Cui
Sen Yang
Guangsheng Bao
Kun Wang
Jun Xie
Yue Zhang
40
109
0
09 Oct 2020
Learning to summarize from human feedback
Learning to summarize from human feedback
Nisan Stiennon
Long Ouyang
Jeff Wu
Daniel M. Ziegler
Ryan J. Lowe
Chelsea Voss
Alec Radford
Dario Amodei
Paul Christiano
ALM
56
1,994
0
02 Sep 2020
SummEval: Re-evaluating Summarization Evaluation
SummEval: Re-evaluating Summarization Evaluation
Alexander R. Fabbri
Wojciech Kry'sciñski
Bryan McCann
Caiming Xiong
R. Socher
Dragomir R. Radev
HILM
38
691
0
24 Jul 2020
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for
  Multi-Document Summarization
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
Yang Gao
Wei-Ye Zhao
Steffen Eger
ELM
27
124
0
07 May 2020
MLSUM: The Multilingual Summarization Corpus
MLSUM: The Multilingual Summarization Corpus
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
32
173
0
30 Apr 2020
Discriminative Adversarial Search for Abstractive Summarization
Discriminative Adversarial Search for Abstractive Summarization
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
34
33
0
24 Feb 2020
Fine-Tuning Language Models from Human Preferences
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
301
1,616
0
18 Sep 2019
Convolutional Neural Networks for Sentence Classification
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
309
13,373
0
25 Aug 2014
1