v1v2 (latest)

Detect, Explain, Escalate: Low-Carbon Dialogue Breakdown Management for LLM-Powered Agents

26 April 2025

Main:9 Pages

4 Figures

Bibliography:2 Pages

1 Tables

Appendix:2 Pages

Abstract

While Large Language Models (LLMs) are transforming numerous applications, their susceptibility to conversational breakdowns remains a critical challenge undermining user trust. This paper introduces a "Detect, Explain, Escalate" framework to manage dialogue breakdowns in LLM-powered agents, emphasizing low-carbon operation. Our approach integrates two key strategies: (1) We fine-tune a compact 8B-parameter model, augmented with teacher-generated reasoning traces, which serves as an efficient real-time breakdown 'detector' and éxplainer'. This model demonstrates robust classification and calibration on English and Japanese dialogues, and generalizes well to the BETOLD dataset, improving accuracy by 7% over its baseline. (2) We systematically evaluate frontier LLMs using advanced prompting (few-shot, chain-of-thought, analogical reasoning) for high-fidelity breakdown assessment. These are integrated into an éscalation' architecture where our efficient detector defers to larger models only when necessary, substantially reducing operational costs and energy consumption. Our fine-tuned model and prompting strategies establish new state-of-the-art results on dialogue breakdown detection benchmarks, outperforming specialized classifiers and significantly narrowing the performance gap to larger proprietary models. The proposed monitor-escalate pipeline reduces inference costs by 54%, offering a scalable, efficient, and more interpretable solution for robust conversational AI in high-impact domains. Code and models will be publicly released.

View on arXiv

@article{ghassel2025_2504.18839,
  title={ Detect, Explain, Escalate: Low-Carbon Dialogue Breakdown Management for LLM-Powered Agents },
  author={ Abdellah Ghassel and Xianzhi Li and Xiaodan Zhu },
  journal={arXiv preprint arXiv:2504.18839},
  year={ 2025 }
}

Comments on this paper