2307.00787
Cited By
Evaluating Shutdown Avoidance of Language Models in Textual Scenarios
3 July 2023
Teun van der Weij
Simon Lermen
Leon Lang
LLMAG
Papers citing
"Evaluating Shutdown Avoidance of Language Models in Textual Scenarios"
3 / 3 papers shown
Title
Exploring Advanced Methodologies in Security Evaluation for LLMs
Junming Huang
Jiawei Zhang
Qi Wang
Weihong Han
Yanchun Zhang
104
0
0
28 Feb 2024
Exploring the Robustness of Model-Graded Evaluations and Automated Interpretability
Simon Lermen
Ondřej Kvapil
ELM
AAML
44
3
0
26 Nov 2023
Large Language Models can Strategically Deceive their Users when Put Under Pressure
Jérémy Scheurer
Mikita Balesni
Marius Hobbhahn
LLMAG
121
60
0
09 Nov 2023