arXiv:2204.00862
CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation
2 April 2022
Pei Ke
Hao Zhou
Yankai Lin
Peng Li
Jie Zhou
Xiaoyan Zhu
Minlie Huang
Papers citing "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation"
31 / 31 papers shown
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Blanca Calvo Figueras
Rodrigo Agerri
ALM
ELM
LRM
22
1
0
16 May 2025
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry
Anum Afzal
Alexandre Mercier
Florian Matthes
65
0
0
29 Apr 2025
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation
Dingyi Yang
Qin Jin
48
5
0
26 Aug 2024
Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices
Patrícia Schmidtová
Saad Mahamood
Simone Balloccu
Ondřej Dušek
Albert Gatt
Dimitra Gkatzia
David M. Howcroft
Ondřej Plátek
Adarsa Sivaprasad
50
3
0
17 Aug 2024
Themis: Towards Flexible and Interpretable NLG Evaluation
Xinyu Hu
Li Lin
Mingqi Gao
Xunjian Yin
Xiaojun Wan
ELM
34
7
0
26 Jun 2024
A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation
I. Zubiaga
A. Soroa
Rodrigo Agerri
42
5
0
21 Jun 2024
RepEval: Effective Text Evaluation with LLM Representation
Shuqian Sheng
Yi Xu
Tianhang Zhang
Zanwei Shen
Luoyi Fu
Jiaxin Ding
Lei Zhou
Xinbing Wang
Cheng Zhou
35
2
0
30 Apr 2024
Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal Interventions
Jordan Meadows
Tamsin James
André Freitas
ReLM
LRM
AI4CE
41
1
0
29 Apr 2024
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models
Yukyung Lee
Soonwon Ka
Bokyung Son
Pilsung Kang
Jaewook Kang
LLMAG
52
6
0
22 Apr 2024
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions
Yihan Chen
Benfeng Xu
Quan Wang
Yi Liu
Zhendong Mao
ALM
ELM
32
26
0
01 Jan 2024
Event Causality Is Key to Computational Story Understanding
Yidan Sun
Qin Chao
Boyang Albert Li
26
5
0
16 Nov 2023
Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation
Tianqi Zhong
Quan Wang
Jingxuan Han
Yongdong Zhang
Zhendong Mao
33
9
0
23 Oct 2023
Language Models Hallucinate, but May Excel at Fact Verification
Jian Guan
Jesse Dodge
David Wadden
Minlie Huang
Hao Peng
LRM
HILM
40
28
0
23 Oct 2023
Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization
Ondrej Skopek
Rahul Aralikatte
Sian Gooding
Victor Carbune
ELM
49
18
0
12 Oct 2023
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Ziwei Ji
Tiezheng Yu
Yan Xu
Nayeon Lee
Etsuko Ishii
Pascale Fung
HILM
11
57
0
10 Oct 2023
Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification
Laurin Wagner
M. Zusag
Theresa Bloder
24
9
0
02 Aug 2023
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering
Pei Ke
Fei Huang
Fei Mi
Yasheng Wang
Qun Liu
Xiaoyan Zhu
Minlie Huang
ReLM
ELM
49
10
0
13 Jul 2023
Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
Weihao Zeng
Lulu Zhao
Keqing He
Ruotong Geng
Jingang Wang
Wei Wu
Weiran Xu
45
3
0
17 Jun 2023
Large Language Models, scientific knowledge and factuality: A systematic analysis in antibiotic discovery
Magdalena Wysocka
Oskar Wysocki
Maxime Delmas
V. Mutel
André Freitas
LM&MA
43
6
0
28 May 2023
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist
Iftitahu Ni'mah
Meng Fang
Vlado Menkovski
Mykola Pechenizkiy
44
13
0
15 May 2023
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
Shangqing Tu
Chunyang Li
Jifan Yu
Xiaozhi Wang
Lei Hou
Juanzi Li
LLMAG
AI4MH
75
10
0
27 Apr 2023
AI vs. Human -- Differentiation Analysis of Scientific Content Generation
Yongqiang Ma
Jiawei Liu
Fan Yi
Qikai Cheng
Yong Huang
Wei Lu
Xiaozhong Liu
DeLMO
14
59
0
24 Jan 2023
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Kyunghyun Cho
James R. Glass
Yulia Tsvetkov
50
44
0
20 Dec 2022
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Zonghan Yang
Xiaoyuan Yi
Peng Li
Yang Liu
Xing Xie
38
33
0
10 Oct 2022
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
Kushal Arora
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
VLM
32
40
0
15 Jun 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann
Elizabeth Clark
Thibault Sellam
ELM
AI4CE
71
184
0
14 Feb 2022
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
57
215
0
14 Jan 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
283
3,879
0
18 Apr 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,930
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,591
0
21 Jan 2020
Adversarial Evaluation of Dialogue Models
Anjuli Kannan
Oriol Vinyals
AAML
ALM
141
76
0
27 Jan 2017