ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2503.17136
  4. Cited By
CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization

CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization

21 March 2025
Brihi Joshi
Sriram Venkatapathy
Mohit Bansal
Nanyun Peng
Haw-Shiuan Chang
    LRM
ArXiv (abs)PDFHTML

Papers citing "CoKe: Customizable Fine-Grained Story Evaluation via Chain-of-Keyword Rationalization"

23 / 23 papers shown
Title
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of
  Diverse Models
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Pat Verga
Sebastian Hofstatter
Sophia Althammer
Yixuan Su
Aleksandra Piktus
Arkady Arkhangorodsky
Minjie Xu
Naomi White
Patrick Lewis
ALMELM
104
104
0
29 Apr 2024
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk
Ishita Mediratta
Christoforos Nalmpantis
Jelena Luketina
Eric Hambro
Edward Grefenstette
Roberta Raileanu
AI4CEALM
182
149
0
10 Oct 2023
AMRs Assemble! Learning to Ensemble with Autoregressive Models for AMR
  Parsing
AMRs Assemble! Learning to Ensemble with Autoregressive Models for AMR Parsing
Abelardo Carlos Martínez Lorenzo
Pere-Lluís Huguet Cabot
Roberto Navigli
72
7
0
19 Jun 2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model
  Training
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Zeqiu Wu
Yushi Hu
Weijia Shi
Nouha Dziri
Alane Suhr
Prithviraj Ammanabrolu
Noah A. Smith
Mari Ostendorf
Hannaneh Hajishirzi
ALM
149
334
0
02 Jun 2023
Let's Verify Step by Step
Let's Verify Step by Step
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALMOffRLLRM
195
1,233
0
31 May 2023
The CoT Collection: Improving Zero-shot and Few-shot Learning of
  Language Models via Chain-of-Thought Fine-Tuning
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Seungone Kim
Se June Joo
Doyoung Kim
Joel Jang
Seonghyeon Ye
Jamin Shin
Minjoon Seo
ALMRALMLRM
87
107
0
23 May 2023
ZARA: Improving Few-Shot Self-Rationalization for Small Language Models
ZARA: Improving Few-Shot Self-Rationalization for Small Language Models
Wei-Lin Chen
An-Zi Yen
Cheng-Kuang Wu
Hen-Hsen Huang
Hsin-Hsi Chen
ReLMLRM
52
11
0
12 May 2023
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum
  Bayes Risk Decoding
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
Mirac Suzgun
Luke Melas-Kyriazi
Dan Jurafsky
67
45
0
14 Nov 2022
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong
Yang Liu
Da Yin
Yuning Mao
Yizhu Jiao
Peng Liu
Chenguang Zhu
Heng Ji
Jiawei Han
ELM
85
276
0
13 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing:
  Benchmarks, Baselines, and Building Blocks for Natural Language Policy
  Optimization
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
101
248
0
03 Oct 2022
Complexity-Based Prompting for Multi-Step Reasoning
Complexity-Based Prompting for Multi-Step Reasoning
Yao Fu
Hao-Chun Peng
Ashish Sabharwal
Peter Clark
Tushar Khot
ReLMLRM
232
444
0
03 Oct 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive
  Explanations
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLMLRM
273
196
0
24 May 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
886
13,176
0
04 Mar 2022
Annotators with Attitudes: How Annotator Beliefs And Identities Bias
  Toxic Language Detection
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap
Swabha Swayamdipta
Laura Vianna
Xuhui Zhou
Yejin Choi
Noah A. Smith
81
283
0
15 Nov 2021
BARTScore: Evaluating Generated Text as Text Generation
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
123
849
0
22 Jun 2021
Measuring Association Between Labels and Free-Text Rationales
Measuring Association Between Labels and Free-Text Rationales
Sarah Wiegreffe
Ana Marasović
Noah A. Smith
313
182
0
24 Oct 2020
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
Jian Guan
Minlie Huang
61
70
0
16 Sep 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
880
42,379
0
28 May 2020
BERTScore: Evaluating Text Generation with BERT
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
352
5,868
0
21 Apr 2019
e-SNLI: Natural Language Inference with Natural Language Explanations
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktaschel
Thomas Lukasiewicz
Phil Blunsom
LRM
425
640
0
04 Dec 2018
Hierarchical Neural Story Generation
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
183
1,626
0
13 May 2018
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
541
19,265
0
20 Jul 2017
CIDEr: Consensus-based Image Description Evaluation
CIDEr: Consensus-based Image Description Evaluation
Ramakrishna Vedantam
C. L. Zitnick
Devi Parikh
297
4,508
0
20 Nov 2014
1