Language Model Evaluation Beyond Perplexity

31 May 2021

Papers citing "Language Model Evaluation Beyond Perplexity"


Title
Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models Behraj Khan T. Syed 180 1 0 29 Jan 2025
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models Kunsheng Tang Wenbo Zhou Jie Zhang Aishan Liu Gelei Deng Shuai Li Peigui Qi Weiming Zhang Tianwei Zhang Nenghai Yu 46 3 0 22 Aug 2024
Jailbreaking Text-to-Image Models with LLM-Based Agents Yingkai Dong Zheng Li Xiangtao Meng Ning Yu Shanqing Guo LLMAG 45 13 0 01 Aug 2024
Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models Avinash Mudireddy Tyler Bell R. Mudumbai 36 1 0 22 May 2024
Position: Key Claims in LLM Research Have a Long Tail of Footnotes Anna Rogers A. Luccioni 53 19 0 14 Aug 2023
Empowering Cross-lingual Behavioral Testing of NLP Models with Typological Features Ester Hlavnova Sebastian Ruder 35 5 0 11 Jul 2023
ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey S. Mohamadi G. Mujtaba Ngan Le Gianfranco Doretto Don Adjeroh LM&MA AI4MH 31 21 0 09 Jul 2023
Dreams Are More "Predictable" Than You Think Lorenzo Bertolini 18 0 0 08 May 2023
Semantic Compression With Large Language Models Henry Gilbert Michael Sandborn Douglas C. Schmidt Jesse Spencer-Smith Jules White 22 22 0 25 Apr 2023
A Natural Bias for Language Generation Models Clara Meister Wojciech Stokowiec Tiago Pimentel Lei Yu Laura Rimell A. Kuncoro MILM 33 6 0 19 Dec 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks Avia Efrat Or Honovich Omer Levy 29 20 0 03 Nov 2022
Truncation Sampling as Language Model Desmoothing John Hewitt Christopher D. Manning Percy Liang BDL 44 76 0 27 Oct 2022
Link the World: Improving Open-domain Conversation with Dynamic Spatiotemporal-aware Knowledge Han Zhou Xinchao Xu Wenquan Wu Zheng-Yu Niu Hua Wu Siqi Bao Fan Wang Haifeng Wang KELM 35 7 0 28 Jun 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation Tiago Pimentel Clara Meister Ryan Cotterell 48 7 0 31 May 2022
Evaluating Distributional Distortion in Neural Language Modeling Benjamin LeBrun Alessandro Sordoni Timothy J. O'Donnell 22 22 0 24 Mar 2022
Probing BERT's priors with serial reproduction chains Takateru Yamakoshi Thomas Griffiths Robert D. Hawkins 29 12 0 24 Feb 2022
How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN R. Thomas McCoy P. Smolensky Tal Linzen Jianfeng Gao Asli Celikyilmaz SyDa 25 119 0 18 Nov 2021
Coherence boosting: When your pretrained language model is not paying enough attention Nikolay Malkin Zhen Wang Nebojsa Jojic RALM 21 35 0 15 Oct 2021