Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

11 August 2023
Marija Šakota
Maxime Peyrard
Robert West

Papers citing "Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling"

37 / 37 papers shown
Title
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
Yexiang Liu
Zekun Li
Zhi Fang
Nan Xu
Ran He
Tieniu Tan
LRM
17
0
0
16 May 2025
Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering
Jihao Zhao
Chunlai Zhou
Biao Qin
55
0
0
05 May 2025
COSMOS: Predictable and Cost-Effective Adaptation of LLMs
Jiayu Wang
Aws Albarghouthi
Frederic Sala
52
0
0
30 Apr 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
30
0
0
23 Apr 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
44
0
0
13 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
54
2
0
08 Mar 2025
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble
Zhijun Chen
Jingzheng Li
Pengpeng Chen
Zhuoran Li
Kai Sun
Yuankai Luo
Qianren Mao
Dingqi Yang
Hailong Sun
Philip S. Yu
ELM
55
5
0
25 Feb 2025
Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral
António Farinhas
Nuno M. Guerreiro
Sweta Agrawal
Ricardo Rei
André F. T. Martins
53
0
0
18 Feb 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
60
2
0
17 Feb 2025
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang
Yanchi Liu
Wei Cheng
Xujiang Zhao
Zhengzhang Chen
Wenchao Yu
Yanjie Fu
Haifeng Chen
57
3
0
09 Feb 2025
PickLLM: Context-Aware RL-Assisted Large Language Model Routing
Dimitrios Sikeridis
Dennis Ramdass
Pranay Pareek
86
1
0
12 Dec 2024
Smoothie: Label Free Language Model Routing
Neel Guha
Mayee F. Chen
Trevor Chow
Ishan S. Khare
Christopher Ré
71
4
0
06 Dec 2024
Plug-and-Play Performance Estimation for LLM Services without Relying on Labeled Data
Can Wang
Dianbo Sui
Hongliang Sun
Hao Ding
Bolin Zhang
Zhiying Tu
29
0
0
10 Oct 2024
GraphRouter: A Graph-based Router for LLM Selections
Tao Feng
Yanzhen Shen
Jiaxuan You
85
10
0
04 Oct 2024
Efficiently Deploying LLMs with Controlled Risk
Michael J. Zellinger
Matt Thomson
41
1
0
03 Oct 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
63
23
0
10 Sep 2024
End User Authoring of Personalized Content Classifiers: Comparing Example Labeling, Rule Writing, and LLM Prompting
Leijie Wang
Kathryn Yurechko
Pranati Dani
Quan Ze Chen
Amy X. Zhang
50
3
0
05 Sep 2024
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao
Menghang Dong
Rongyu Zhang
Wenzhao Zheng
Yunpeng Zhang
Huanrui Yang
Dalong Du
Kurt Keutzer
Shanghang Zhang
51
0
0
15 Aug 2024
Logistic Regression makes small LLMs strong and explainable "tens-of-shot" classifiers
Marcus Buckmann
Edward Hill
40
2
0
06 Aug 2024
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs
Quang H. Nguyen
Duy C. Hoang
Juliette Decugis
Saurav Manchanda
Nitesh V. Chawla
Khoa D. Doan
45
6
0
15 Jul 2024
Cascade-Aware Training of Language Models
Congchao Wang
Sean Augenstein
Keith Rush
Wittawat Jitkrittum
Harikrishna Narasimhan
A. S. Rawat
A. Menon
Alec Go
36
4
0
29 May 2024
Cost-efficient Knowledge-based Question Answering with Large Language Models
Junnan Dong
Qinggang Zhang
Chuang Zhou
Hao Chen
Daochen Zha
Xiao Huang
33
3
0
27 May 2024
OptLLM: Optimal Assignment of Queries to Large Language Models
Yueyue Liu
Hongyu Zhang
Yuantian Miao
Van-Hoang Le
Zhiqiang Li
21
2
0
24 May 2024
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection
Guillem Ramírez
Alexandra Birch
Ivan Titov
40
8
0
03 May 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
53
42
0
15 Apr 2024
RouterBench: A Benchmark for Multi-LLM Routing System
Qitian Jason Hu
Jacob Bieker
Xiuyu Li
Nan Jiang
Benjamin Keigwin
Gaurav Ranganath
Kurt Keutzer
Shriyash Kaustubh Upadhyay
44
36
0
18 Mar 2024
Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
Lingjiao Chen
Jared Quincy Davis
Boris Hanin
Peter Bailis
Ion Stoica
Matei A. Zaharia
James Zou
LRM
29
0
0
04 Mar 2024
EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models
Muhammad Shihab Rashid
Jannat Ara Meem
Yue Dong
Vagelis Hristidis
LRM
36
0
0
16 Feb 2024
IoT in the Era of Generative AI: Vision and Challenges
Xin Wang
Zhongwei Wan
Arvin Hekmati
M. Zong
Samiul Alam
Mi Zhang
Bhaskar Krishnamachari
32
15
0
03 Jan 2024
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking
Chia-Hsuan Lee
Hao Cheng
Mari Ostendorf
47
4
0
16 Nov 2023
Cache & Distil: Optimising API Calls to Large Language Models
Guillem Ramírez
Matthias Lindemann
Alexandra Birch
Ivan Titov
40
3
0
20 Oct 2023
AutoMix: Automatically Mixing Language Models
Pranjal Aggarwal
Aman Madaan
Ankit Anand
Srividya Pranavi Potharaju
Swaroop Mishra
...
Karthik Kappaganthu
Yiming Yang
Shyam Upadhyay
Manaal Faruqui
Mausam
42
17
0
19 Oct 2023
Stranger Danger! Cross-Community Interactions with Fringe Users Increase the Growth of Fringe Communities on Reddit
Giuseppe Russo
Manoel Horta Ribeiro
Robert West
24
10
0
18 Oct 2023
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
Murong Yue
Jie Zhao
Min Zhang
Liang Du
Ziyu Yao
LRM
35
55
0
04 Oct 2023
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
79
47
0
30 Sep 2021
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
141
684
0
31 Jan 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020