Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

6 June 2024

William W. Cohen

Papers citing "Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation"

Cer-Eval: Certifiable and Cost-Efficient Evaluation Framework for LLMs. G. Wang, Z. Chen, Bo Li, Haifeng Xu. 02 May 2025.
Context-Aware Doubly-Robust Semi-Supervised Learning. Clement Ruah, Houssem Sifaou, Osvaldo Simeone, Bashir M. Al-Hashimi. 24 Feb 2025.
FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference. Stefano Cortinovis, François Caron. 04 Feb 2025.
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets. Yuxin Wang, Maresa Schröder, Dennis Frauen, J. Schweisthal, Konstantin Hess, Stefan Feuerriegel. 16 Dec 2024. [CML]
Auto-Evaluation with Few Labels through Post-hoc Regression. Benjamin Eyre, David Madras. 19 Nov 2024.
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data. Florian E. Dorner, Vivian Y. Nastl, Moritz Hardt. 17 Oct 2024. [ELM, ALM]
Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control. Anastasios Nikolas Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, Lihua Lei. 03 Oct 2021.
Distribution-Free, Risk-Controlling Prediction Sets. Stephen Bates, Anastasios Nikolas Angelopoulos, Lihua Lei, Jitendra Malik, Michael I. Jordan. 07 Jan 2021. [OOD]