2306.02561
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion
5 June 2023
Dongfu Jiang
Xiang Ren
Bill Yuchen Lin
ELM
Papers citing "LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion"
50 / 240 papers shown
Title
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
Mingkang Zhu
Xi Chen
Zhongdao Wang
Bei Yu
Hengshuang Zhao
Jiaya Jia
17
0
0
17 Jun 2025
TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks
Zhou Chen
Zhiqiang Wei
Yuqi Bai
Xue Xiong
Jianmin Wu
3DV
16
0
0
14 Jun 2025
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
Shubhashis Roy Dipta
Francis Ferraro
51
0
0
11 Jun 2025
Leveraging LLMs to Evaluate Usefulness of Document
Xingzhu Wang
Erhan Zhang
Yiqun Chen
Jinghan Xuan
Yucheng Hou
Yitong Xu
Ying Nie
Shuaiqiang Wang
Dawei Yin
Jiaxin Mao
56
0
0
10 Jun 2025
MEMETRON: Metaheuristic Mechanisms for Test-time Response Optimization of Large Language Models
S. Nguyen
Theja Tulabandhula
25
0
0
10 Jun 2025
Federated In-Context Learning: Iterative Refinement for Improved Answer Quality
Ruhan Wang
Zhiyong Wang
Chengkai Huang
Rui Wang
Tong Yu
Lina Yao
John C. S. Lui
Dongruo Zhou
17
0
0
09 Jun 2025
Debiasing Online Preference Learning via Preference Feature Preservation
Dongyoung Kim
Jinsung Yoon
Jinwoo Shin
Jaehyung Kim
17
0
0
06 Jun 2025
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques
Adarsh Prasad Behera
J. Champati
Roberto Morabito
Sasu Tarkoma
J. Gross
25
0
0
06 Jun 2025
SPARTA ALIGNMENT: Collectively Aligning Multiple Language Models through Combat
Yuru Jiang
Wenxuan Ding
Shangbin Feng
Greg Durrett
Yulia Tsvetkov
88
0
0
05 Jun 2025
RewardAnything: Generalizable Principle-Following Reward Models
Zhuohao Yu
Jiali Zeng
Weizheng Gu
Yidong Wang
Jindong Wang
Fandong Meng
Jie Zhou
Yue Zhang
Shikun Zhang
Wei Ye
LRM
109
1
0
04 Jun 2025
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
Ruihan Jin
Pengpeng Shao
Zhengqi Wen
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Jianhua Tao
58
0
0
04 Jun 2025
Adaptive Graph Pruning for Multi-Agent Communication
Boyi Li
Zhonghan Zhao
Der-Horng Lee
Gaoang Wang
LLMAG
45
0
0
03 Jun 2025
One for All: Update Parameterized Knowledge Across Multiple Models
Weitao Ma
Xiyuan Du
Xiaocheng Feng
L. Huang
Yichong Huang
...
Xiaoliang Yang
Baohang Li
Xiachong Feng
Ting Liu
Bing Qin
KELM
53
0
0
01 Jun 2025
RLAE: Reinforcement Learning-Assisted Ensemble for LLMs
Y. Fu
Yuanheng Zhu
Jiajun Chai
Guojun Yin
Wei Lin
Qichao Zhang
Dongbin Zhao
25
0
0
31 May 2025
Writing-Zero: Bridge the Gap Between Non-verifiable Tasks and Verifiable Rewards
Xun Lu
Yunyi Yang
Yongbo Gai
Kai Luo
Shihao Huang
Jianhe Lin
Xiaoxi Jiang
Guanjun Jiang
52
0
0
30 May 2025
How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning
Hongyi Cai
Junlin Wang
Xiaoyin Chen
Bhuwan Dhingra
LRM
24
0
0
30 May 2025
SDPO: Importance-Sampled Direct Preference Optimization for Stable Diffusion Training
Xiaomeng Yang
Zhiyu Tan
Junyan Wang
Zhijian Zhou
Hao Li
75
0
0
28 May 2025
Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration
Sibo Xiao
Zixin Lin
Wenyang Gao
Yue Zhang
LLMAG
60
0
0
27 May 2025
Fundamental Limits of Game-Theoretic LLM Alignment: Smith Consistency and Preference Matching
Zhekun Shi
Kaizhao Liu
Qi Long
Weijie J. Su
Jiancong Xiao
43
2
0
27 May 2025
The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants
Yiqun Zhang
Hao Li
Chenxu Wang
L. Chen
Qiaosheng Zhang
...
Xinrun Wang
Jia Xu
Lei Bai
Wanli Ouyang
Shuyue Hu
79
0
0
26 May 2025
Dynamically Learned Test-Time Model Routing in Language Model Zoos with Service Level Guarantees
Herbert Woisetschläger
Ryan Zhang
Shiqiang Wang
Hans-Arno Jacobsen
39
0
0
26 May 2025
Autocomp: LLM-Driven Code Optimization for Tensor Accelerators
Charles Hong
Sahil Bhatia
Alvin Cheung
Y. Shao
69
1
0
24 May 2025
Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing
Zijie Qiu
Jiaqi Wei
Xiang Zhang
Sheng Xu
Kai Zou
Zhi Jin
Zhiqiang Gao
Nanqing Dong
S. Sun
BDL
88
2
0
23 May 2025
INFERENCEDYNAMICS: Efficient Routing Across LLMs through Structured Capability and Knowledge Profiling
Haochen Shi
Tianshi Zheng
Weiqi Wang
Baixuan Xu
Chunyang Li
Chunkit Chan
Tao Fan
Yangqiu Song
Qiang Yang
95
1
0
22 May 2025
Optimizing LLM-Based Multi-Agent System with Textual Feedback: A Case Study on Software Development
Ming Shen
Raphael Shu
Anurag Pratik
James Gung
Yubin Ge
Monica Sunkara
Yi Zhang
LLMAG
63
0
0
22 May 2025
LightRouter: Towards Efficient LLM Collaboration with Minimal Overhead
Yifan Zhang
Xinkui Zhao
Zuxin Wang
Guanjie Cheng
Yueshen Xu
Shuiguang Deng
Yuxiang Cai
93
0
0
22 May 2025
X-MAS: Towards Building Multi-Agent Systems with Heterogeneous LLMs
Rui Ye
Xiangrui Liu
Qimin Wu
Xianghe Pang
Zhenfei Yin
Lei Bai
Siheng Chen
LLMAG
81
0
0
22 May 2025
Bayesian Optimization for Enhanced Language Models: Optimizing Acquisition Functions
Zishuo Bao
Yibo Liu
Changyutao Qiu
216
0
0
22 May 2025
In-Domain African Languages Translation Using LLMs and Multi-armed Bandits
Pratik Rakesh Singh
Kritarth Prasad
Mohammadi Zaki
Pankaj Wasnik
39
0
0
21 May 2025
Learnware of Language Models: Specialized Small Language Models Can Do Big
Zhi-Hao Tan
Zi-Chen Zhao
Hao-Yu Shi
Xin-Yu Zhang
Peng Tan
Yang Yu
Zhi Zhou
132
0
0
19 May 2025
Investigating the Vulnerability of LLM-as-a-Judge Architectures to Prompt-Injection Attacks
Narek Maloyan
Bislan Ashinov
Dmitry Namiot
AAML
ELM
85
0
0
19 May 2025
MR. Judge: Multimodal Reasoner as a Judge
Renjie Pi
Felix Bai
Qibin Chen
Simon Wang
Jiulong Shan
Kieran Liu
Meng Cao
ELM
LRM
120
0
0
19 May 2025
OMAC: A Broad Optimization Framework for LLM-Based Multi-Agent Collaboration
Shijun Li
Hilaf Hasson
Joydeep Ghosh
LLMAG
112
0
0
17 May 2025
InfoPO: On Mutual Information Maximization for Large Language Model Alignment
Teng Xiao
Zhen Ge
Sujay Sanghavi
Tian Wang
Julian Katz-Samuels
Marc Versage
Qingjun Cui
Trishul Chilimbi
203
1
0
13 May 2025
Internet of Agents: Fundamentals, Applications, and Challenges
Yuntao Wang
Shaolong Guo
Yanghe Pan
Zhou Su
Fahao Chen
Tom H. Luan
Peng Li
Jiawen Kang
Dusit Niyato
LLMAG
LM&Ro
AI4CE
150
1
0
12 May 2025
On the Robustness of Reward Models for Language Model Alignment
Jiwoo Hong
Noah Lee
Eunki Kim
Guijin Son
Woojin Chung
Aman Gupta
Shao Tang
James Thorne
99
0
0
12 May 2025
MedSyn: Enhancing Diagnostics with Human-AI Collaboration
Burcu Sayin
Ipek Baris Schlicht
Ngoc Vo Hong
Sara Allievi
Jacopo Staiano
Pasquale Minervini
Andrea Passerini
LM&MA
28
0
0
07 May 2025
Position: Enough of Scaling LLMs! Lets Focus on Downscaling
Ayan Sengupta
Tanmoy Chakraborty
112
0
0
02 May 2025
DNB-AI-Project at SemEval-2025 Task 5: An LLM-Ensemble Approach for Automated Subject Indexing
Lisa Kluge
Maximilian Kähler
394
1
0
30 Apr 2025
Ensemble Bayesian Inference: Leveraging Small Language Models to Achieve LLM-level Accuracy in Profile Matching Tasks
Haru-Tada Sato
Fuka Matsuzaki
Jun-ichiro Takahashi
UQCV
76
0
0
24 Apr 2025
Exploring How LLMs Capture and Represent Domain-Specific Knowledge
Mirian Hipolito Garcia
Camille Couturier
Daniel Madrigal Diaz
Ankur Mallick
Anastasios Kyrillidis
Robert Sim
Victor Rühle
Saravan Rajmohan
72
1
0
23 Apr 2025
FlowReasoner: Reinforcing Query-Level Meta-Agents
Hongcheng Gao
Yue Liu
Yufei He
Longxu Dou
C. Du
Zhijie Deng
Bryan Hooi
Min Lin
Tianyu Pang
AIFin
LRM
112
4
0
21 Apr 2025
Synergistic Weak-Strong Collaboration by Aligning Preferences
Yizhu Jiao
Xuchao Zhang
Zhaoyang Wang
Yubo Ma
Zhun Deng
Rujia Wang
Chetan Bansal
Saravan Rajmohan
Jiawei Han
Huaxiu Yao
481
0
0
21 Apr 2025
Q-FAKER: Query-free Hard Black-box Attack via Controlled Generation
CheolWon Na
YunSeok Choi
Jee-Hyong Lee
AAML
71
0
0
18 Apr 2025
Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR
Yize Zhang
Tianyi Liang
Xinyue Huang
Erfei Cui
Xu Guo
Pei Chu
Chenhui Li
Ru Zhang
Wenhai Wang
Gongshen Liu
348
0
0
15 Apr 2025
Learning to Be A Doctor: Searching for Effective Medical Agent Architectures
Yangyang Zhuang
Wenjia Jiang
Jing Zhang
Ze Yang
Qiufeng Wang
Chi Zhang
AI4CE
76
0
0
15 Apr 2025
CHARM: Calibrating Reward Models With Chatbot Arena Scores
Xiao Zhu
Chenmien Tan
Pinzhen Chen
Rico Sennrich
Yanlin Zhang
Hanxu Hu
ALM
118
1
0
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
86
0
0
14 Apr 2025
Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
Yuta Matsui
Ryosuke Yamaki
Ryo Ueda
Seitaro Shinagawa
Tadahiro Taniguchi
MLLM
105
1
0
13 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong
Wei Shen
Yanzeng Li
Songyang Gao
Hua Lu
Yicheng Chen
Yang Zhang
Wei Zhou
Jinjie Gu
Lei Zou
LRM
124
11
0
12 Apr 2025