Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models

7 February 2024

Papers citing "Unmasking the Shadows of AI: Investigating Deceptive Capabilities in Large Language Models"

Towards Safe and Honest AI Agents with Neural Self-Other Overlap
Marc Carauleanu, Michael Vaiana, Judd Rosenblatt, Cameron Berg, Diogo Schwerz de Lucena
20 Dec 2024