"Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?

25 June 2024

Robert Litschko

Anna Korhonen

Barbara Plank

Papers citing "Seeing the Big through the Small": Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?

Training and Evaluating with Human Label Variation: An Empirical Study
Kemal Kurniawan, Meladel Mistica, Timothy Baldwin, Jey Han Lau
85 | 1 | 0 | 03 Feb 2025

Beyond correlation: The Impact of Human Uncertainty in Measuring the Effectiveness of Automatic Evaluation and LLM-as-a-Judge
Aparna Elangovan, Jongwoo Ko, Lei Xu, Mahsa Elyasi, Ling Liu, S. Bodapati, Dan Roth
75 | 6 | 0 | 28 Jan 2025

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Pat Verga, Sebastian Hofstatter, Sophia Althammer, Yixuan Su, Aleksandra Piktus, Arkady Arkhangorodsky, Minjie Xu, Naomi White, Patrick Lewis
ALM, ELM | 75 | 96 | 0 | 29 Apr 2024

Stop Measuring Calibration When Humans Disagree
Joris Baan, Wilker Aziz, Barbara Plank, Raquel Fernández
45 | 54 | 0 | 28 Oct 2022

Mitigating Neural Network Overconfidence with Logit Normalization
Hongxin Wei, Renchunzi Xie, Hao-Ran Cheng, Lei Feng, Bo An, Yixuan Li
OODD | 210 | 277 | 0 | 19 May 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
LM&Ro, LRM, AI4CE, ReLM | 598 | 9,009 | 0 | 28 Jan 2022

RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
AIMat | 426 | 24,160 | 0 | 26 Jul 2019

When Does Label Smoothing Help?
Rafael Müller, Simon Kornblith, Geoffrey E. Hinton
UQCV | 145 | 1,931 | 0 | 06 Jun 2019

A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
Adina Williams, Nikita Nangia, Samuel R. Bowman
434 | 4,444 | 0 | 18 Apr 2017