ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2204.00862
  4. Cited By
CTRLEval: An Unsupervised Reference-Free Metric for Evaluating
  Controlled Text Generation

CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation

2 April 2022
Pei Ke
Hao Zhou
Yankai Lin
Peng Li
Jie Zhou
Xiaoyan Zhu
Minlie Huang
ArXivPDFHTML

Papers citing "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation"

31 / 31 papers shown
Title
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models
Banca Calvo Figueras
Rodrigo Agerri
ALM
ELM
LRM
22
1
0
16 May 2025
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry
JaccDiv: A Metric and Benchmark for Quantifying Diversity of Generated Marketing Text in the Music Industry
Anum Afzal
Alexandre Mercier
Florian Matthes
65
0
0
29 Apr 2025
What Makes a Good Story and How Can We Measure It? A Comprehensive
  Survey of Story Evaluation
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation
Dingyi Yang
Qin Jin
48
5
0
26 Aug 2024
Automatic Metrics in Natural Language Generation: A Survey of Current
  Evaluation Practices
Automatic Metrics in Natural Language Generation: A Survey of Current Evaluation Practices
Patrícia Schmidtová
Saad Mahamood
Simone Balloccu
Ondřej Dušek
Albert Gatt
Dimitra Gkatzia
David M. Howcroft
Ondřej Plátek
Adarsa Sivaprasad
50
3
0
17 Aug 2024
Themis: Towards Flexible and Interpretable NLG Evaluation
Themis: Towards Flexible and Interpretable NLG Evaluation
Xinyu Hu
Li Lin
Mingqi Gao
Xunjian Yin
Xiaojun Wan
ELM
34
7
0
26 Jun 2024
A LLM-Based Ranking Method for the Evaluation of Automatic
  Counter-Narrative Generation
A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation
I. Zubiaga
A. Soroa
Rodrigo Agerri
42
5
0
21 Jun 2024
RepEval: Effective Text Evaluation with LLM Representation
RepEval: Effective Text Evaluation with LLM Representation
Shuqian Sheng
Yi Xu
Tianhang Zhang
Zanwei Shen
Luoyi Fu
Jiaxin Ding
Lei Zhou
Xinbing Wang
Cheng Zhou
35
2
0
30 Apr 2024
Exploring the Limits of Fine-grained LLM-based Physics Inference via
  Premise Removal Interventions
Exploring the Limits of Fine-grained LLM-based Physics Inference via Premise Removal Interventions
Jordan Meadows
Tamsin James
André Freitas
ReLM
LRM
AI4CE
41
1
0
29 Apr 2024
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models
Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models
Yukyung Lee
Soonwon Ka
Bokyung Son
Pilsung Kang
Jaewook Kang
LLMAG
52
6
0
22 Apr 2024
Benchmarking Large Language Models on Controllable Generation under
  Diversified Instructions
Benchmarking Large Language Models on Controllable Generation under Diversified Instructions
Yihan Chen
Benfeng Xu
Quan Wang
Yi Liu
Zhendong Mao
ALM
ELM
32
26
0
01 Jan 2024
Event Causality Is Key to Computational Story Understanding
Event Causality Is Key to Computational Story Understanding
Yidan Sun
Qin Chao
Boyang Albert Li
26
5
0
16 Nov 2023
Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time
  Controllable Text Generation
Air-Decoding: Attribute Distribution Reconstruction for Decoding-Time Controllable Text Generation
Tianqi Zhong
Quan Wang
Jingxuan Han
Yongdong Zhang
Zhendong Mao
33
9
0
23 Oct 2023
Language Models Hallucinate, but May Excel at Fact Verification
Language Models Hallucinate, but May Excel at Fact Verification
Jian Guan
Jesse Dodge
David Wadden
Minlie Huang
Hao Peng
LRM
HILM
40
28
0
23 Oct 2023
Towards Better Evaluation of Instruction-Following: A Case-Study in
  Summarization
Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization
Ondrej Skopek
Rahul Aralikatte
Sian Gooding
Victor Carbune
ELM
49
18
0
12 Oct 2023
Towards Mitigating Hallucination in Large Language Models via
  Self-Reflection
Towards Mitigating Hallucination in Large Language Models via Self-Reflection
Ziwei Ji
Tiezheng Yu
Yan Xu
Nayeon Lee
Etsuko Ishii
Pascale Fung
HILM
11
57
0
10 Oct 2023
Careful Whisper -- leveraging advances in automatic speech recognition
  for robust and interpretable aphasia subtype classification
Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification
Laurin Wagner
M. Zusag
Theresa Bloder
24
9
0
02 Aug 2023
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed
  Question Answering
DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering
Pei Ke
Fei Huang
Fei Mi
Yasheng Wang
Qun Liu
Xiaoyan Zhu
Minlie Huang
ReLM
ELM
49
10
0
13 Jul 2023
Seen to Unseen: Exploring Compositional Generalization of
  Multi-Attribute Controllable Dialogue Generation
Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation
Weihao Zeng
Lulu Zhao
Keqing He
Ruotong Geng
Jingang Wang
Wei Wu
Weiran Xu
45
3
0
17 Jun 2023
Large Language Models, scientific knowledge and factuality: A systematic
  analysis in antibiotic discovery
Large Language Models, scientific knowledge and factuality: A systematic analysis in antibiotic discovery
Magdalena Wysocka
Oskar Wysocki
Maxime Delmas
V. Mutel
André Freitas
LM&MA
43
6
0
28 May 2023
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric
  Preference Checklist
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist
Iftitahu Ni'mah
Meng Fang
Vlado Menkovski
Mykola Pechenizkiy
44
13
0
15 May 2023
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time
Shangqing Tu
Chunyang Li
Jifan Yu
Xiaozhi Wang
Lei Hou
Juanzi Li
LLMAG
AI4MH
75
10
0
27 Apr 2023
AI vs. Human -- Differentiation Analysis of Scientific Content
  Generation
AI vs. Human -- Differentiation Analysis of Scientific Content Generation
Yongqiang Ma
Jiawei Liu
Fan Yi
Qikai Cheng
Yong Huang
Wei Lu
Xiaozhong Liu
DeLMO
14
59
0
24 Jan 2023
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
On the Blind Spots of Model-Based Evaluation Metrics for Text Generation
Tianxing He
Jingyu Zhang
Tianle Wang
Sachin Kumar
Kyunghyun Cho
James R. Glass
Yulia Tsvetkov
50
44
0
20 Dec 2022
Unified Detoxifying and Debiasing in Language Generation via
  Inference-time Adaptive Optimization
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Zonghan Yang
Xiaoyuan Yi
Peng Li
Yang Liu
Xing Xie
38
33
0
10 Oct 2022
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
Kushal Arora
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
VLM
32
40
0
15 Jun 2022
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation
  Practices for Generated Text
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann
Elizabeth Clark
Thibault Sellam
ELM
AI4CE
71
184
0
14 Feb 2022
A Survey of Controllable Text Generation using Transformer-based
  Pre-trained Language Models
A Survey of Controllable Text Generation using Transformer-based Pre-trained Language Models
Hanqing Zhang
Haolin Song
Shaoyu Li
Ming Zhou
Dawei Song
57
215
0
14 Jan 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
283
3,879
0
18 Apr 2021
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,930
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,591
0
21 Jan 2020
Adversarial Evaluation of Dialogue Models
Adversarial Evaluation of Dialogue Models
Anjuli Kannan
Oriol Vinyals
AAML
ALM
141
76
0
27 Jan 2017
1