Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates

7 October 2024

Chaithanya Bandi

Hari Bandi

Abir Harrasse

Abstract

This paper explores optimal architectures for evaluating the outputs of large language models (LLMs) using LLMs themselves. We propose a novel framework that interprets LLMs as advocates within an ensemble of interacting agents, allowing them to defend their answers and reach conclusions through a judge and jury system. This approach offers a more dynamic and comprehensive evaluation process than traditional human-based assessments or automated metrics. We discuss the motivation behind this framework, its key components, and its comparative advantages. We also present a probabilistic model to evaluate the error reduction achieved by iterative advocate systems. Finally, we outline experiments to validate the effectiveness of multi-advocate architectures and discuss future research directions.
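The abstract does not spell out the probabilistic model, so the following is only a toy illustration of why aggregating several evaluators can reduce error, not the paper's method. It assumes a jury of independent jurors who each reach the correct verdict with the same probability and decide by simple majority (a Condorcet-style calculation). The function name `majority_error` and its parameters are hypothetical.

```python
from math import comb


def majority_error(p_correct: float, n_jurors: int) -> float:
    """Probability that a simple majority of jurors reaches the wrong verdict.

    Assumptions (illustrative, not taken from the paper):
    - each juror is correct independently with probability p_correct
    - n_jurors is odd, so ties cannot occur
    """
    if n_jurors < 1 or n_jurors % 2 == 0:
        raise ValueError("n_jurors must be a positive odd number")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must lie in [0, 1]")

    votes_needed = n_jurors // 2 + 1
    # Sum the binomial probabilities of every outcome with a correct majority.
    p_majority_correct = sum(
        comb(n_jurors, k) * p_correct**k * (1.0 - p_correct) ** (n_jurors - k)
        for k in range(votes_needed, n_jurors + 1)
    )
    return 1.0 - p_majority_correct


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, round(majority_error(0.7, n), 4))
```

Under these assumptions, a single juror with 70% accuracy errs 30% of the time, while a three-juror majority errs 21.6% of the time. The improvement depends on the independence assumption; correlated errors among LLM jurors would shrink it.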
