ResearchTrend.AI
Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

6 June 2024
Adam Fisch
Joshua Maynez
R. A. Hofer
Bhuwan Dhingra
Amir Globerson
William W. Cohen

Papers citing "Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation"

8 / 8 papers shown
Cer-Eval: Certifiable and Cost-Efficient Evaluation Framework for LLMs
G. Wang
Z. Chen
Bo Li
Haifeng Xu
126
0
0
02 May 2025
Context-Aware Doubly-Robust Semi-Supervised Learning
Clement Ruah
Houssem Sifaou
Osvaldo Simeone
Bashir M. Al-Hashimi
70
0
0
24 Feb 2025
FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference
Stefano Cortinovis
François Caron
76
0
0
04 Feb 2025
Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets
Yuxin Wang
Maresa Schröder
Dennis Frauen
J. Schweisthal
Konstantin Hess
Stefan Feuerriegel
CML
72
0
0
16 Dec 2024
Auto-Evaluation with Few Labels through Post-hoc Regression
Benjamin Eyre
David Madras
75
1
0
19 Nov 2024
Limits to scalable evaluation at the frontier: LLM as Judge won't beat twice the data
Florian E. Dorner
Vivian Y. Nastl
Moritz Hardt
ELM
ALM
47
5
0
17 Oct 2024
Learn then Test: Calibrating Predictive Algorithms to Achieve Risk Control
Anastasios Nikolas Angelopoulos
Stephen Bates
Emmanuel J. Candès
Michael I. Jordan
Lihua Lei
100
125
0
03 Oct 2021
Distribution-Free, Risk-Controlling Prediction Sets
Stephen Bates
Anastasios Nikolas Angelopoulos
Lihua Lei
Jitendra Malik
Michael I. Jordan
OOD
181
186
0
07 Jan 2021