Sparse Activation Editing for Reliable Instruction Following in Narratives

22 May 2025

Main: 7 Pages
Bibliography: 3 Pages
Appendix: 5 Pages
6 Figures
5 Tables

Abstract

Complex narrative contexts often challenge language models' ability to follow instructions, and existing benchmarks fail to capture these difficulties. To address this, we propose Concise-SAE, a training-free framework that improves instruction following by identifying and editing instruction-relevant neurons using only natural language instructions, without requiring labelled data. To thoroughly evaluate our method, we introduce FreeInstruct, a diverse and realistic benchmark of 1,212 examples that highlights the challenges of instruction following in narrative-rich settings. While initially motivated by complex narratives, Concise-SAE demonstrates state-of-the-art instruction adherence across varied tasks without compromising generation quality.


@article{zhao2025_2505.16505,
  title={Sparse Activation Editing for Reliable Instruction Following in Narratives},
  author={Runcong Zhao and Chengyu Cao and Qinglin Zhu and Xiucheng Lv and Shun Shao and Lin Gui and Ruifeng Xu and Yulan He},
  journal={arXiv preprint arXiv:2505.16505},
  year={2025}
}
