ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Identifying and Manipulating Personality Traits in LLMs Through Activation Engineering

Rumi A. Allbert, James K. Wiles, Vlad Grankovsky
LLMSV, AI4CE
10 December 2024

Papers citing "Identifying and Manipulating Personality Traits in LLMs Through Activation Engineering"

11 / 11 papers shown
Neural Transparency: Mechanistic Interpretability Interfaces for Anticipating Model Behaviors for Personalized AI
Sheer Karny, Anthony Baez, Pat Pataranutaporn
AAML
31 Oct 2025
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
Wenhan Dong, Yuemeng Zhao, Zhen Sun, Yule Liu, Zifan Peng, ..., Jun Wu, Ruiming Wang, Shengmin Xu, Xinyi Huang, Xinlei He
LLMAG
30 Apr 2025
Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models
Zhiyuan Wen, Yu Yang, Jiannong Cao, Haoming Sun, Ruosong Yang, Shuaiqi Liu
25 Jun 2024
Steering Llama 2 via Contrastive Activation Addition
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Nina Rimsky, Nick Gabrieli, Julian Schulz, Meg Tong, Evan Hubinger, Alexander Matt Turner
LLMSV
09 Dec 2023
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yi Dong, Zhilin Wang, Makesh Narsimhan Sreedhar, Xianchao Wu, Oleksii Kuchaiev
ALM, LLMSV
09 Oct 2023
Personality Traits in Large Language Models
Gregory Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, P. Romero, Marwa Abdulhai, Aleksandra Faust, Maja J. Matarić
LM&MA, LLMAG
01 Jul 2023
PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits
Hang Jiang, Xiajie Zhang, Xubo Cao, Cynthia Breazeal, Deb Roy, Jad Kabbara
04 May 2023
Mass-Editing Memory in a Transformer
International Conference on Learning Representations (ICLR), 2022
Kevin Meng, Arnab Sen Sharma, A. Andonian, Yonatan Belinkov, David Bau
KELM, VLM
13 Oct 2022
Extracting Latent Steering Vectors from Pretrained Language Models
Findings of the Association for Computational Linguistics, 2022
Nishant Subramani, Nivedita Suresh, Matthew E. Peters
LLMSV
10 May 2022
Locating and Editing Factual Associations in GPT
Neural Information Processing Systems (NeurIPS), 2022
Kevin Meng, David Bau, A. Andonian, Yonatan Belinkov
KELM
10 Feb 2022
What do Neural Machine Translation Models Learn about Morphology?
Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James R. Glass
11 Apr 2017