Large Language Model Routing with Benchmark Datasets


27 September 2023
Tal Shnitzer
Anthony Ou
Mírian Silva
Kate Soule
Yuekai Sun
Justin Solomon
Neil Thompson
Mikhail Yurochkin
    RALM

Papers citing "Large Language Model Routing with Benchmark Datasets"

29 / 29 papers shown
Synergistic Weak-Strong Collaboration by Aligning Preferences
Yizhu Jiao
Xuchao Zhang
Zhaoyang Wang
Yubo Ma
Zhun Deng
Rujia Wang
Chetan Bansal
Saravan Rajmohan
Jiawei Han
Huaxiu Yao
443
0
0
21 Apr 2025
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
Jianhao Chen
Zishuo Xun
Bocheng Zhou
Han Qi
Qiaosheng Zhang
...
Wei Hu
Yuzhong Qu
Wanli Ouyang
Shuyue Hu
134
2
0
01 Apr 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
176
7
0
08 Mar 2025
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble
Zhijun Chen
Jingzheng Li
Pengpeng Chen
Zhuoran Li
Kai Sun
Yuankai Luo
Qianren Mao
Dingqi Yang
Hailong Sun
Philip S. Yu
ELM
112
15
0
25 Feb 2025
Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
Yi-Kai Zhang
De-Chuan Zhan
Han-Jia Ye
ALM, ELM, LRM
187
4
0
24 Feb 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
117
2
0
14 Oct 2024
Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling
Yuxuan Yao
Han Wu
Mingyang Liu
Sichun Luo
Xiongwei Han
Jie Liu
Zhijiang Guo
Linqi Song
93
7
0
03 Oct 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
177
29
0
10 Sep 2024
SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models
Kaushal Kumar Maurya
KV Aditya Srivatsa
Ekaterina Kochmar
70
2
0
16 Aug 2024
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
Quang H. Nguyen
Duy C. Hoang
Juliette Decugis
Saurav Manchanda
Nitesh Chawla
Khoa D. Doan
198
10
0
15 Jul 2024
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
Dongfu Jiang
Xiang Ren
Bill Yuchen Lin
ELM
73
320
0
05 Jun 2023
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
Lingjiao Chen
Matei A. Zaharia
James Zou
LLMAG
165
240
0
09 May 2023
Predicting Out-of-Distribution Error with the Projection Norm
Yaodong Yu
Zitong Yang
Alexander Wei
Yi-An Ma
Jacob Steinhardt
OODD
53
44
0
11 Feb 2022
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance
Saurabh Garg
Sivaraman Balakrishnan
Zachary Chase Lipton
Behnam Neyshabur
Hanie Sedghi
OODD, OOD
65
130
0
11 Jan 2022
Predicting with Confidence on Unseen Distributions
Devin Guillory
Vaishaal Shankar
Sayna Ebrahimi
Trevor Darrell
Ludwig Schmidt
UQCV, OOD
57
122
0
07 Jul 2021
Mandoline: Model Evaluation under Distribution Shift
Mayee F. Chen
Karan Goel
N. Sohoni
Fait Poms
Kayvon Fatahalian
Christopher Ré
67
72
0
01 Jul 2021
Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles
Jiefeng Chen
Frederick Liu
Besim Avci
Xi Wu
Yingyu Liang
S. Jha
63
63
0
29 Jun 2021
Assessing Generalization of SGD via Disagreement
Yiding Jiang
Vaishnavh Nagarajan
Christina Baek
J. Zico Kolter
95
114
0
25 Jun 2021
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
107
843
0
22 Jun 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts
Pang Wei Koh
Shiori Sagawa
Henrik Marklund
Sang Michael Xie
Marvin Zhang
...
A. Kundaje
Emma Pierson
Sergey Levine
Chelsea Finn
Percy Liang
OOD
183
1,434
0
14 Dec 2020
In Search of Lost Domain Generalization
Ishaan Gulrajani
David Lopez-Paz
OOD
79
1,149
0
02 Jul 2020
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,226
0
27 Aug 2019
Invariant Risk Minimization
Martín Arjovsky
Léon Bottou
Ishaan Gulrajani
David Lopez-Paz
OOD
192
2,229
0
05 Jul 2019
Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
Yaniv Ovadia
Emily Fertig
Jie Jessie Ren
Zachary Nado
D. Sculley
Sebastian Nowozin
Joshua V. Dillon
Balaji Lakshminarayanan
Jasper Snoek
UQCV
170
1,695
0
06 Jun 2019
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems
Alex Wang
Yada Pruksachatkun
Nikita Nangia
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
265
2,315
0
02 May 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
326
5,845
0
21 Apr 2019
Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning
S. Raschka
121
782
0
13 Nov 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,182
0
20 Apr 2018
Deep CORAL: Correlation Alignment for Deep Domain Adaptation
Baochen Sun
Kate Saenko
OOD
103
3,156
0
06 Jul 2016