2405.21030
Standards for Belief Representations in LLMs
31 May 2024
Daniel A. Herrmann
B. Levinstein
Papers citing "Standards for Belief Representations in LLMs"
11 / 11 papers shown
Replicating Human Social Perception in Generative AI: Evaluating the Valence-Dominance Model
Necdet Gurkan
Kimathi Njoki
Jordan W. Suchow
43
0
0
05 Mar 2025
Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze
Sarthak Munshi
Bryan Sukidi
Jennifer Yen
Zejia Yang
David Williams-King
Linh Le
Kosi Asuzu
Carsten Maple
102
0
0
24 Feb 2025
Representation in large language models
Cameron C. Yetman
41
1
0
03 Jan 2025
Chatting with Bots: AI, Speech Acts, and the Edge of Assertion
Iwan Williams
Tim Bayne
34
1
0
22 Oct 2024
On the attribution of confidence to large language models
Geoff Keeling
Winnie Street
LRM
29
2
0
11 Jul 2024
Specific versus General Principles for Constitutional AI
Sandipan Kundu
Yuntao Bai
Saurav Kadavath
Amanda Askell
Andrew Callahan
...
Zac Hatfield-Dodds
Sören Mindermann
Nicholas Joseph
Sam McCandlish
Jared Kaplan
AILaw
58
26
0
20 Oct 2023
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets
Samuel Marks
Max Tegmark
HILM
102
168
0
10 Oct 2023
The Internal State of an LLM Knows When It's Lying
A. Azaria
Tom Michael Mitchell
HILM
218
299
0
26 Apr 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
289
3,003
0
22 Mar 2023
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
122
317
0
21 Sep 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
313
11,915
0
04 Mar 2022