Applying sparse autoencoders to unlearn knowledge in language models

25 October 2024
Eoin Farrell
Yeu-Tong Lau
Arthur Conmy
    MU
arXiv:2410.19278

Papers citing "Applying sparse autoencoders to unlearn knowledge in language models"

5 / 5 papers shown
Are Sparse Autoencoders Useful for Java Function Bug Detection?
Rui Melo
Claudia Mamede
Andre Catarino
Rui Abreu
Henrique Lopes Cardoso
31
0
0
15 May 2025
Understanding the Repeat Curse in Large Language Models from a Feature Perspective
Junchi Yao
Shu Yang
Jianhua Xu
Lijie Hu
Mengdi Li
Di Wang
27
0
0
19 Apr 2025
Steering off Course: Reliability Challenges in Steering Language Models
Patrick Queiroz Da Silva
Hari Sethuraman
Dheeraj Rajagopal
Hannaneh Hajishirzi
Sachin Kumar
LLMSV
37
1
0
06 Apr 2025
SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders
Bartosz Cywiński
Kamil Deja
DiffM
63
6
0
29 Jan 2025
Tracking the Feature Dynamics in LLM Training: A Mechanistic Study
Yang Xu
Yansen Wang
Hao Wang
168
1
0
23 Dec 2024