Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders


12 May 2025
Dong Shu
Xuansheng Wu
Haiyan Zhao
Jundong Li
Ninghao Liu
    LLMSV
arXiv: 2505.08080

Papers citing "Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders"

9 / 9 papers shown
Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders
Xuansheng Wu
Jiayi Yuan
Wenlin Yao
Xiaoming Zhai
Ninghao Liu
LLMSV
147
10
0
24 Feb 2025
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models
Z. He
Haiyan Zhao
Yiran Qiao
Fan Yang
Ali Payani
Jing Ma
Jundong Li
LLMSV
110
9
0
17 Feb 2025
Identifiable Steering via Sparse Autoencoding of Multi-Concept Shifts
Shruti Joshi
Andrea Dittadi
Sébastien Lachapelle
Dhanya Sridhar
LLMSV
81
2
0
14 Feb 2025
Adaptive Sparse Allocation with Mutual Choice & Feature Choice Sparse Autoencoders
Kola Ayonrinde
68
5
0
04 Nov 2024
Efficient Training of Sparse Autoencoders for Large Language Models via Layer Groups
Davide Ghilardi
Federico Belotti
Marco Molinari
64
5
0
28 Oct 2024
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Yu Zhao
Alessio Devoto
Giwon Hong
Xiaotang Du
Aryo Pradipta Gema
Hongru Wang
Xuanli He
Kam-Fai Wong
Pasquale Minervini
KELM, LLMSV
101
25
0
21 Oct 2024
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
Dan Braun
Jordan K. Taylor
Nicholas Goldowsky-Dill
Lee D. Sharkey
70
39
0
17 May 2024
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
286
8,134
0
16 Jun 2016
Linear Algebraic Structure of Word Senses, with Applications to Polysemy
Sanjeev Arora
Yuanzhi Li
Yingyu Liang
Tengyu Ma
Andrej Risteski
83
283
0
14 Jan 2016