An explainable transformer circuit for compositional generalization

19 February 2025
Cheng Tang
Brenden Lake
Mehrdad Jazayeri
Abstract

Compositional generalization, the systematic combination of known components into novel structures, remains a core challenge in cognitive science and machine learning. Although transformer-based large language models can exhibit strong performance on certain compositional tasks, the mechanisms driving these abilities remain opaque, calling their interpretability into question. In this work, we identify and mechanistically interpret the circuit responsible for compositional induction in a compact transformer. Using causal ablations, we validate the circuit and formalize its operation with a program-like description. We further demonstrate that this mechanistic understanding enables precise activation edits that steer the model's behavior predictably. Our findings advance the understanding of complex behaviors in transformers and highlight that such insights can provide a direct pathway for model control.
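The causal ablations mentioned in the abstract are typically implemented by intervening on intermediate activations during a forward pass and measuring the effect on the output. Below is a minimal sketch of one such intervention (zero-ablating a single attention head via a PyTorch forward hook), assuming a model whose attention modules emit concatenated head outputs; the module path "blocks.1.attn" and the zero-ablation strategy are illustrative placeholders, not the paper's actual circuit or method.

```python
# Minimal sketch of a causal ablation via a PyTorch forward hook.
# Assumption: the attention module's output is a tensor (or tuple whose
# first element is a tensor) of shape [..., n_heads * d_head], with head
# outputs concatenated along the last dimension.
import torch
import torch.nn as nn

def ablate_head(model: nn.Module, layer_name: str, head: int, n_heads: int):
    """Register a hook that zeroes one attention head's output slice."""
    module = dict(model.named_modules())[layer_name]  # hypothetical path

    def hook(mod, inputs, output):
        out = output[0] if isinstance(output, tuple) else output
        d_head = out.shape[-1] // n_heads
        out = out.clone()  # avoid in-place edits on the autograd graph
        out[..., head * d_head:(head + 1) * d_head] = 0.0  # zero-ablate
        return (out, *output[1:]) if isinstance(output, tuple) else out

    return module.register_forward_hook(hook)

# Usage: compare clean vs. ablated logits to estimate the head's causal
# contribution, then remove the hook.
# handle = ablate_head(model, "blocks.1.attn", head=3, n_heads=8)
# ablated_logits = model(tokens)
# handle.remove()
```

The same hook mechanism supports the activation edits described in the abstract: instead of zeroing a slice, one writes a chosen value into it to steer the model's behavior.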

@article{tang2025_2502.15801,
  title={An explainable transformer circuit for compositional generalization},
  author={Cheng Tang and Brenden Lake and Mehrdad Jazayeri},
  journal={arXiv preprint arXiv:2502.15801},
  year={2025}
}