2504.02922
Cited By
Robustly identifying concepts introduced during chat fine-tuning using crosscoders
3 April 2025
Julian Minder
Clement Dumas
Caden Juang
Bilal Chughtai
Neel Nanda
Papers citing
"Robustly identifying concepts introduced during chat fine-tuning using crosscoders"
1 / 1 papers shown
Title
Response Uncertainty and Probe Modeling: Two Sides of the Same Coin in LLM Interpretability?
Yongjie Wang
Yibo Wang
Xin Zhou
Zhiqi Shen
24 May 2025