2410.04962
Cited By
Activation Scaling for Steering and Interpreting Language Models
7 October 2024
Niklas Stoehr, Kevin Du, Vésteinn Snæbjarnarson, Robert West, Ryan Cotterell, Aaron Schein
LLMSV
LRM
Papers citing
"Activation Scaling for Steering and Interpreting Language Models"
2 / 2 papers shown
Title
Better Estimation of the KL Divergence Between Language Models
Afra Amini, Tim Vieira, Ryan Cotterell
51
0
0
14 Apr 2025
Controllable Context Sensitivity and the Knob Behind It
Julian Minder, Kevin Du, Niklas Stoehr, Giovanni Monea, Chris Wendler, Robert West, Ryan Cotterell
KELM
55
3
0
11 Nov 2024