2410.13919
LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild
17 October 2024
Reworr
Dmitrii Volkov
LLMAG
Papers citing
"LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild"
2 / 2 papers shown
Winning at All Cost: A Small Environment for Eliciting Specification Gaming Behaviors in Large Language Models
Lars Malmqvist
21
0
0
07 May 2025
Demonstrating specification gaming in reasoning models
Alexander Bondarenko
Denis Volk
Dmitrii Volkov
Jeffrey Ladish
LRM
LLMAG
44
3
0
18 Feb 2025