A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing

12 October 2023
Carlos Gómez-Rodríguez
Paul Williams

Papers citing "A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing"

49 / 49 papers shown
Calibrating Uncertainty Quantification of Multi-Modal LLMs using Grounding
Trilok Padhi
R. Kaur
Adam D. Cobb
Manoj Acharya
Anirban Roy
Colin Samplawski
Brian Matejek
Alexander M. Berenbeim
Nathaniel D. Bastian
Susmit Jha
22
0
0
30 Apr 2025
Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach
Ruizhe Li
Chiwei Zhu
Benfeng Xu
Xiaorui Wang
Zhendong Mao
27
0
0
22 Apr 2025
BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation
Yiting Ran
Xintao Wang
Tian Qiu
Jiaqing Liang
Yanghua Xiao
Deqing Yang
LLMAG
SyDa
40
0
0
20 Apr 2025
Plan-and-Refine: Diverse and Comprehensive Retrieval-Augmented Generation
Alireza Salemi
Chris Samarinas
Hamed Zamani
24
0
0
10 Apr 2025
Has the Creativity of Large-Language Models peaked? An analysis of inter- and intra-LLM variability
Jennifer Haase
P. Hanel
Sebastian Pokutta
ALM
LRM
62
0
0
10 Apr 2025
ArxivBench: Can LLMs Assist Researchers in Conducting Research?
Ning Li
Jingran Zhang
Justin Cui
19
0
0
06 Apr 2025
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM
Xinyu Fang
Z. Chen
Kai Lan
Lixin Ma
Shengyuan Ding
...
Zicheng Zhang
Guofeng Zhang
Haodong Duan
K. Chen
D. Lin
MLLM
58
1
0
18 Mar 2025
WritingBench: A Comprehensive Benchmark for Generative Writing
Yuning Wu
Jiahao Mei
M. Yan
Chenliang Li
Shaopeng Lai
...
Zijia Wang
J. Zhang
Mengyue Wu
Qin Jin
Fei Huang
72
3
0
07 Mar 2025
Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs' Decoding Layers
Zicong He
Boxuan Zhang
Lu Cheng
47
0
0
04 Mar 2025
MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages
Chen Zhang
Mingxu Tao
Zhiyuan Liao
Yansong Feng
36
0
0
03 Mar 2025
Automated Evaluation of Meter and Rhyme in Russian Generative and Human-Authored Poetry
Ilya Koziev
57
0
0
28 Feb 2025
Grandes modelos de lenguaje: de la predicción de palabras a la comprensión?
Carlos Gómez-Rodríguez
SyDa
AILaw
ELM
VLM
100
0
0
25 Feb 2025
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
Abdullah Al Rabeyah
Fabrício Góes
Marco Volpe
Talles Medeiros
77
1
0
23 Nov 2024
Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash
Parsa Hejabi
Elnaz Rahmati
Alireza S. Ziabari
Preni Golazizian
Jesse Thomason
Morteza Dehghani
LLMAG
42
1
0
15 Nov 2024
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Yuxi Xie
Guanzhen Li
Xiao Xu
Min-Yen Kan
MLLM
VLM
57
13
0
05 Nov 2024
Graph-based Confidence Calibration for Large Language Models
Yukun Li
Sijia Wang
Lifu Huang
Li-Ping Liu
UQCV
31
1
0
03 Nov 2024
ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise
Xingang Guo
Darioush Keivan
U. Syed
Lianhui Qin
Huan Zhang
Geir Dullerud
Peter M. Seiler
Bin Hu
26
4
0
17 Oct 2024
Revealing the Barriers of Language Agents in Planning
Jian Xie
Kexun Zhang
Jiangjie Chen
Siyu Yuan
Kai Zhang
Yikai Zhang
Lei Li
Yanghua Xiao
LM&Ro
AIFin
LRM
27
6
0
16 Oct 2024
Preference Optimization with Multi-Sample Comparisons
Chaoqi Wang
Zhuokai Zhao
Chen Zhu
Karthik Abinav Sankararaman
Michal Valko
...
Zhaorun Chen
Madian Khabsa
Yuxin Chen
Hao Ma
Sinong Wang
62
10
0
16 Oct 2024
A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts
Hung-Fu Chang
Tong Li
KELM
LLMAG
34
2
0
10 Oct 2024
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
Qiyuan Zhang
Yufei Wang
Tiezheng YU
Yuxin Jiang
Chuhan Wu
...
Xin Jiang
Lifeng Shang
Ruiming Tang
Fuyuan Lyu
Chen Ma
26
4
0
07 Oct 2024
CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints
Anirudh Atmakuru
Jatin Nainani
Rohith Siddhartha Reddy Bheemreddy
Anirudh Lakkaraju
Zonghai Yao
Hamed Zamani
Haw-Shiuan Chang
99
2
0
05 Oct 2024
MirrorStories: Reflecting Diversity through Personalized Narrative Generation with Large Language Models
Sarfaroz Yunusov
Hamza Sidat
Ali Emami
72
0
0
20 Sep 2024
Initial Development and Evaluation of the Creative Artificial Intelligence through Recurring Developments and Determinations (CAIRDD) System
Jeremy Straub
Zach Johnson
38
0
0
03 Sep 2024
What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation
Dingyi Yang
Qin Jin
36
5
0
26 Aug 2024
LLM-Generated Tips Rival Expert-Created Tips in Helping Students Answer Quantum-Computing Questions
L. Krupp
Jonas Bley
Isacco Gobbi
Alexander Geng
Sabine Müller
...
Artur Widera
Herwig Ott
P. Lukowicz
Jakob Karolus
Maximilian Kiefer-Emmanouilidis
35
3
0
24 Jul 2024
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
47
28
0
22 Jul 2024
Benchmarking Language Model Creativity: A Case Study on Code Generation
Yining Lu
Dixuan Wang
Tianjian Li
Dongwei Jiang
Meng Jiang
Daniel Khashabi
LRM
54
10
0
12 Jul 2024
JailbreakHunter: A Visual Analytics Approach for Jailbreak Prompts Discovery from Large-Scale Human-LLM Conversational Datasets
Zhihua Jin
Shiyi Liu
Haotian Li
Xun Zhao
Huamin Qu
38
3
0
03 Jul 2024
Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?
Guillermo Marco
Julio Gonzalo
Ramón del Castillo
M. Girona
22
10
0
01 Jul 2024
The Unlikely Duel: Evaluating Creative Writing in LLMs through a Unique Scenario
Carlos Gómez-Rodríguez
Paul Williams
21
1
0
22 Jun 2024
Measuring Psychological Depth in Language Models
Fabrice Harel-Canada
Hanyu Zhou
Sreya Mupalla
Zeynep Yildiz
Amit Sahai
Nanyun Peng
35
3
0
18 Jun 2024
Is persona enough for personality? Using ChatGPT to reconstruct an agent's latent personality from simple descriptions
Yongyi Ji
Zhisheng Tang
M. Kejriwal
34
4
0
18 Jun 2024
MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time
Jikun Kang
Xin Zhe Li
Xi Chen
Amirreza Kazemi
Qianyi Sun
...
Xu He
Quan He
Feng Wen
Jianye Hao
Jun Yao
LRM
ReLM
29
14
0
25 May 2024
MarkLLM: An Open-Source Toolkit for LLM Watermarking
Leyi Pan
Aiwei Liu
Zhiwei He
Zitian Gao
Xuandong Zhao
...
Shuliang Liu
Xuming Hu
Lijie Wen
Irwin King
Philip S. Yu
46
27
0
16 May 2024
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play
Li-Chun Lu
Shou-Jen Chen
Tsung-Min Pai
Chan-Hung Yu
Hung-yi Lee
Shao-Hua Sun
LLMAG
51
39
0
10 May 2024
Enhancing Creativity in Large Language Models through Associative Thinking Strategies
Pronita Mehrotra
Aishni Parab
Sumit Gulwani
LRM
31
6
0
09 May 2024
Train & Constrain: Phonologically Informed Tongue-Twister Generation from Topics and Paraphrases
Tyler Loakman
Chen Tang
Chenghua Lin
38
4
0
20 Mar 2024
Emerging Opportunities of Using Large Language Models for Translation Between Drug Molecules and Indications
David Oniani
Jordan Hilsman
Chengxi Zang
Junmei Wang
Lianjin Cai
Jan Zawala
Yanshan Wang
11
7
0
14 Feb 2024
T-RAG: Lessons from the LLM Trenches
M. Fatehkia
J. Lucas
Sanjay Chawla
LLMAG
32
21
0
12 Feb 2024
Personalized Text Generation with Fine-Grained Linguistic Control
Bashar Alhafni
Vivek Kulkarni
Dhruv Kumar
Vipul Raheja
16
11
0
07 Feb 2024
Scalable Qualitative Coding with LLMs: Chain-of-Thought Reasoning Matches Human Performance in Some Hermeneutic Tasks
Zackary Dunivin
21
16
0
26 Jan 2024
From Prompt Engineering to Prompt Science With Human in the Loop
Chirag Shah
34
9
0
01 Jan 2024
Experimental Narratives: A Comparison of Human Crowdsourced Storytelling and AI Storytelling
Nina Beguš
25
20
0
19 Oct 2023
Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals
Piotr Wojciech Mirowski
Kory W. Mathewson
Jaylen Pittman
Richard Evans
HAI
55
250
0
29 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
308
11,915
0
04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
213
1,656
0
15 Oct 2021
The Perils of Using Mechanical Turk to Evaluate Open-Ended Text Generation
Marzena Karpinska
Nader Akoury
Mohit Iyyer
204
106
0
14 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
248
1,986
0
31 Dec 2020