Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts

10 April 2025

Abstract

We propose a novel framework for generating causal graphs from narrative texts, bridging high-level causality and detailed event-specific relationships. Our method first extracts concise, agent-centered vertices using large language model (LLM)-based summarization. We introduce an "Expert Index," comprising seven linguistically informed features, integrated into a Situation-Task-Action-Consequence (STAC) classification model. This hybrid system, combining RoBERTa embeddings with the Expert Index, achieves superior precision in causal link identification compared to pure LLM-based approaches. Finally, a structured five-iteration prompting process refines and constructs connected causal graphs. Experiments on 100 narrative chapters and short stories demonstrate that our approach consistently outperforms GPT-4o and Claude 3.5 in causal graph quality, while maintaining readability. The open-source tool provides an interpretable, efficient solution for capturing nuanced causal chains in narratives.
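The abstract names two mechanical pieces that a small example can make concrete: the hybrid input (an embedding joined with the seven Expert Index features) and the requirement that the final causal graph be connected. The sketch below is illustrative only and is not the authors' implementation. The names `hybrid_features` and `CausalGraph` are hypothetical, plain concatenation is an assumed way of combining the two feature sets, and "connected" is assumed to mean weakly connected (edge direction ignored).

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

# The four classes named in the abstract.
STAC_LABELS = ("Situation", "Task", "Action", "Consequence")
EXPERT_INDEX_SIZE = 7  # the abstract states seven linguistic features


def hybrid_features(embedding, expert_index):
    """Assumed combination: concatenate a sentence embedding with the
    seven Expert Index features into one classifier input vector."""
    if len(expert_index) != EXPERT_INDEX_SIZE:
        raise ValueError("expected exactly seven Expert Index features")
    return list(embedding) + list(expert_index)


@dataclass
class CausalGraph:
    """Hypothetical container: vertices carry a STAC label, edges are
    directed (cause, effect) pairs."""
    labels: dict = field(default_factory=dict)
    edges: set = field(default_factory=set)

    def add_vertex(self, text, label):
        if label not in STAC_LABELS:
            raise ValueError(f"unknown STAC label: {label}")
        self.labels[text] = label

    def add_link(self, cause, effect):
        if cause not in self.labels or effect not in self.labels:
            raise KeyError("both endpoints must be added as vertices first")
        self.edges.add((cause, effect))

    def is_connected(self):
        """Weak connectivity check via breadth-first search."""
        if not self.labels:
            return True
        neighbors = defaultdict(set)
        for cause, effect in self.edges:
            neighbors[cause].add(effect)
            neighbors[effect].add(cause)
        start = next(iter(self.labels))
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return len(seen) == len(self.labels)
```

A connectivity check of this kind could serve as a stopping condition for an iterative refinement loop: while `is_connected()` returns `False`, further causal links still need to be proposed between the disconnected components.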


@article{li2025_2504.07459,
  title={Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts},
  author={Zehan Li and Ruhua Pan and Xinyu Pi},
  journal={arXiv preprint arXiv:2504.07459},
  year={2025}
}
