arXiv:2503.02080 (v2, latest)
Linear Representations of Political Perspective Emerge in Large Language Models
3 March 2025
Junsol Kim
James Evans
Aaron Schein
Papers citing
"Linear Representations of Political Perspective Emerge in Large Language Models"
42 / 42 papers shown
Title
LegiGPT: Party Politics and Transport Policy with Large Language Model
Hyunsoo Yun
Eun Hak Lee
11
0
0
20 Jun 2025
Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective
Bhavik Chandna
Zubair Bashir
Procheta Sen
85
0
0
05 Jun 2025
From Directions to Cones: Exploring Multidimensional Representations of Propositional Facts in LLMs
Stanley Yu
Vaidehi Bulusu
Oscar Yasunaga
Clayton Lau
Cole Blondin
Sean O'Brien
Kevin Zhu
Vasu Sharma
49
0
0
27 May 2025
EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding
Zhaowei Zhang
Minghua Yi
Mengmeng Wang
Fengshuo Bai
Zilong Zheng
Yipeng Kang
Yaodong Yang
61
1
0
26 May 2025
LLM Social Simulations Are a Promising Research Method
Jacy Reese Anthis
Ryan Liu
Sean M. Richardson
Austin C. Kozlowski
Bernard Koch
James A. Evans
Erik Brynjolfsson
Michael S. Bernstein
ALM
97
15
0
03 Apr 2025
Generative Agent Simulations of 1,000 People
Joon Sung Park
Carolyn Q. Zou
Aaron Shaw
Benjamin Mako Hill
Carrie J. Cai
Meredith Ringel Morris
Robb Willer
Percy Liang
Michael S. Bernstein
SyDa
VGen
LM&Ro
AI4CE
74
103
0
15 Nov 2024
Hidden Persuaders: LLMs' Political Leaning and Their Influence on Voters
Yujin Potter
Shiyang Lai
Junsol Kim
James Evans
Basel Alomair
82
20
0
31 Oct 2024
Refusal in Language Models Is Mediated by a Single Direction
Andy Arditi
Oscar Obeso
Aaquib Syed
Daniel Paleka
Nina Panickssery
Wes Gurnee
Neel Nanda
169
218
0
17 Jun 2024
Dishonesty in Helpful and Harmless Alignment
Youcheng Huang
Jingkun Tang
Duanyu Feng
Zheng Zhang
Wenqiang Lei
Jiancheng Lv
Anthony G. Cohn
LLMSV
91
4
0
04 Jun 2024
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
Yejin Bang
Delong Chen
Nayeon Lee
Pascale Fung
85
41
0
27 Mar 2024
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger
Valentin Hofmann
Valentina Pyatkin
Musashi Hinck
Hannah Rose Kirk
Hinrich Schütze
Dirk Hovy
ELM
87
64
0
26 Feb 2024
A Language Model's Guide Through Latent Space
Dimitri von Rutte
Sotiris Anagnostidis
Gregor Bachmann
Thomas Hofmann
105
28
0
22 Feb 2024
Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives
Chen Gao
Xiaochong Lan
Nian Li
Yuan Yuan
Jingtao Ding
Zhilun Zhou
Fengli Xu
Yong Li
LLMAG
AI4CE
LM&Ro
105
132
0
19 Dec 2023
Measurement in the Age of LLMs: An Application to Ideological Scaling
Sean O'Hagan
Aaron Schein
196
11
0
14 Dec 2023
The Linear Representation Hypothesis and the Geometry of Large Language Models
Kiho Park
Yo Joong Choe
Victor Veitch
LLMSV
MILM
170
190
0
07 Nov 2023
Linear Representations of Sentiment in Large Language Models
Curt Tigges
Oskar John Hollinsworth
Atticus Geiger
Neel Nanda
MILM
67
91
0
23 Oct 2023
Towards Understanding Sycophancy in Language Models
Mrinank Sharma
Meg Tong
Tomasz Korbak
David Duvenaud
Amanda Askell
...
Oliver Rausch
Nicholas Schiefer
Da Yan
Miranda Zhang
Ethan Perez
364
246
0
20 Oct 2023
Concept-Guided Chain-of-Thought Prompting for Pairwise Comparison Scoring of Texts with Large Language Models
Patrick Y. Wu
Jonathan Nagler
Joshua A. Tucker
Solomon Messing
LRM
127
3
0
18 Oct 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
144
227
0
10 Oct 2023
Simulating Social Media Using Large Language Models to Evaluate Alternative News Feed Algorithms
Petter Törnberg
D. Valeeva
J. Uitermark
Christopher Bail
LLMAG
78
44
0
05 Oct 2023
Language Models Represent Space and Time
Wes Gurnee
Max Tegmark
133
167
0
03 Oct 2023
Emergent Linear Representations in World Models of Self-Supervised Sequence Models
Neel Nanda
Andrew Lee
Martin Wattenberg
FAtt
MILM
120
186
0
02 Sep 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
454
12,106
0
18 Jul 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
Kenneth Li
Oam Patel
Fernanda Viégas
Hanspeter Pfister
Martin Wattenberg
KELM
HILM
143
584
0
06 Jun 2023
AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction
Junsol Kim
Byungkyu Lee
SyDa
105
37
0
16 May 2023
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Shangbin Feng
Chan Young Park
Yuhan Liu
Yulia Tsvetkov
100
248
0
15 May 2023
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
422
1,989
0
07 Apr 2023
Whose Opinions Do Language Models Reflect?
Shibani Santurkar
Esin Durmus
Faisal Ladhak
Cinoo Lee
Percy Liang
Tatsunori Hashimoto
92
447
0
30 Mar 2023
Large Language Models Can Be Used to Estimate the Latent Positions of Politicians
Patrick Y. Wu
Jonathan Nagler
Joshua A. Tucker
Solomon Messing
175
28
0
21 Mar 2023
Language Models as Agent Models
Jacob Andreas
LLMAG
82
141
0
03 Dec 2022
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
200
380
0
21 Sep 2022
CommunityLM: Probing Partisan Worldviews from Language Models
Hang Jiang
Doug Beeferman
Brandon Roy
Deb Roy
166
32
0
15 Sep 2022
Out of One, Many: Using Language Models to Simulate Human Samples
Lisa P. Argyle
Ethan C. Busby
Nancy Fulda
Joshua R Gubler
Christopher Rytting
David Wingate
SyDa
103
607
0
14 Sep 2022
Assessing Political Prudence of Open-domain Chatbots
Yejin Bang
Nayeon Lee
Etsuko Ishii
Andrea Madotto
Pascale Fung
70
25
0
11 Jun 2021
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
311
457
0
24 Feb 2021
Intrinsic Bias Metrics Do Not Correlate with Application Bias
Seraphina Goldfarb-Tarrant
Rebecca Marchant
Ricardo Muñoz Sánchez
Mugdha Pandya
Adam Lopez
155
180
0
31 Dec 2020
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
Su Lin Blodgett
Solon Barocas
Hal Daumé
Hanna M. Wallach
159
1,257
0
28 May 2020
Measurement and Fairness
Abigail Z. Jacobs
Hanna M. Wallach
90
402
0
11 Dec 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
120
1,070
0
25 May 2019
The Geometry of Culture: Analyzing Meaning through Word Embeddings
Austin C. Kozlowski
Matt Taddy
James A. Evans
58
393
0
25 Mar 2018
Understanding intermediate layers using linear classifier probes
Guillaume Alain
Yoshua Bengio
FAtt
175
958
0
05 Oct 2016
Probabilistic Archetypal Analysis
S. Seth
M. Eugster
91
69
0
29 Dec 2013