ResearchTrend.AI
Sparse Activation Editing for Reliable Instruction Following in Narratives

22 May 2025
Runcong Zhao, Chengyu Cao, Qinglin Zhu, Xiucheng Lv, Shun Shao, Lin Gui, Ruifeng Xu, Yulan He
arXiv: 2505.16505

Papers citing "Sparse Activation Editing for Reliable Instruction Following in Narratives"

4 / 4 papers shown
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models
Z. He, Haiyan Zhao, Yiran Qiao, Fan Yang, Ali Payani, Jing Ma, Jundong Li
LLMSV
121
9
0
17 Feb 2025
RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following
Junru Lu, Jiazheng Li, Guodong Shen, Lin Gui, Siyu An, Yulan He, Di Yin, Xing Sun
53
1
0
17 Feb 2025
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, Pasquale Minervini
KELM, LLMSV
134
28
0
21 Oct 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, Aaron Mueller
173
159
0
28 Mar 2024