ResearchTrend.AI
Estimating Large Language Model Capabilities without Labeled Test Data

24 May 2023
Harvey Yiyun Fu
Qinyuan Ye
Albert Xu
Xiang Ren
Robin Jia
arXiv: 2305.14802

Papers citing "Estimating Large Language Model Capabilities without Labeled Test Data"

15 / 15 papers shown
Predicting generalization performance with correctness discriminators
Yuekun Yao
Alexander Koller
94
1
0
15 Nov 2023
The False Promise of Imitating Proprietary LLMs
Arnav Gudibande
Eric Wallace
Charles Burton Snell
Xinyang Geng
Hao Liu
Pieter Abbeel
Sergey Levine
Dawn Song
ALM
112
205
0
25 May 2023
On the Relation between Sensitivity and Accuracy in In-context Learning
Yanda Chen
Chen Zhao
Zhou Yu
Kathleen McKeown
He He
218
80
0
16 Sep 2022
Estimating Model Performance under Domain Shifts with Class-Specific Confidence Scores
Zeju Li
Konstantinos Kamnitsas
Mobarakol Islam
Chen Chen
Ben Glocker
52
9
0
20 Jul 2022
Language Models (Mostly) Know What They Know
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
...
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
117
826
0
11 Jul 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM, LRM
498
6,240
0
05 Apr 2022
Predicting Out-of-Distribution Error with the Projection Norm
Yaodong Yu
Zitong Yang
Alexander Wei
Yi-An Ma
Jacob Steinhardt
OODD
53
44
0
11 Feb 2022
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance
Saurabh Garg
Sivaraman Balakrishnan
Zachary Chase Lipton
Behnam Neyshabur
Hanie Sedghi
OODD, OOD
65
130
0
11 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
342
1,702
0
15 Oct 2021
Datasets: A Community Library for Natural Language Processing
Quentin Lhoest
Albert Villanova del Moral
Yacine Jernite
A. Thakur
Patrick von Platen
...
Thibault Goehringer
Victor Mustar
François Lagunas
Alexander M. Rush
Thomas Wolf
216
610
0
07 Sep 2021
Predicting with Confidence on Unseen Distributions
Devin Guillory
Vaishaal Shankar
Sayna Ebrahimi
Trevor Darrell
Ludwig Schmidt
UQCV, OOD
60
122
0
07 Jul 2021
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
128
437
0
24 May 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
286
184
0
18 Apr 2021
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
291
300
0
17 Mar 2020
Verified Uncertainty Calibration
Ananya Kumar
Percy Liang
Tengyu Ma
167
356
0
23 Sep 2019