2305.12798
Cited By
Word Embeddings Are Steers for Language Models
22 May 2023
Chi Han
Jialiang Xu
Manling Li
Yi R. Fung
Chenkai Sun
Nan Jiang
Tarek F. Abdelzaher
Heng Ji
LLMSV
Papers citing
"Word Embeddings Are Steers for Language Models"
14 / 14 papers shown
Title
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives
Ayoung Lee
Ryan Sungmo Kwon
Peter Railton
Lu Wang
ELM
51
0
0
15 Apr 2025
Personalize Your LLM: Fake it then Align it
Yijing Zhang
Dyah Adila
Changho Shin
Frederic Sala
88
0
0
02 Mar 2025
PropaInsight: Toward Deeper Understanding of Propaganda in Terms of Techniques, Appeals, and Intent
Jiateng Liu
Lin Ai
Zizhou Liu
Payam Karisani
Zheng Hui
May Fung
Preslav Nakov
Julia Hirschberg
Heng Ji
DiffM
90
4
0
17 Feb 2025
Evaluating the Prompt Steerability of Large Language Models
Erik Miehling
Michael Desmond
K. Ramamurthy
Elizabeth M. Daly
Pierre L. Dognin
Jesus Rios
Djallel Bouneffouf
Miao Liu
LLMSV
89
3
0
19 Nov 2024
Focus On This, Not That! Steering LLMs With Adaptive Feature Specification
Tom A. Lamb
Adam Davies
Alasdair Paren
Philip Torr
Francesco Pinto
52
0
0
30 Oct 2024
Programming Refusal with Conditional Activation Steering
Bruce W. Lee
Inkit Padhi
K. Ramamurthy
Erik Miehling
Pierre L. Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
105
14
0
06 Sep 2024
Continuous Language Model Interpolation for Dynamic and Controllable Text Generation
Sara Kangaslahti
David Alvarez-Melis
KELM
34
0
0
10 Apr 2024
Controlled Text Generation with Natural Language Instructions
Wangchunshu Zhou
Yuchen Eleanor Jiang
Ethan Gotlieb Wilcox
Ryan Cotterell
Mrinmaya Sachan
160
84
0
27 Apr 2023
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi R. Fung
Tuhin Chakrabarty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
21
39
0
16 Oct 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
218
138
0
23 Jan 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
223
618
0
03 Sep 2019
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
281
31,267
0
16 Jan 2013