
Title |
|---|
Steering Llama 2 via Contrastive Activation Addition. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 |
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 |
Mass-Editing Memory in a Transformer. International Conference on Learning Representations (ICLR), 2022 |
Extracting Latent Steering Vectors from Pretrained Language Models. Findings of the Association for Computational Linguistics (Findings of ACL), 2022 |
Locating and Editing Factual Associations in GPT. Neural Information Processing Systems (NeurIPS), 2022 |