Sabotage Evaluations for Frontier Models

28 October 2024
Joe Benton
Misha Wagner
Eric Christiansen
Cem Anil
Ethan Perez
Jai Srivastav
Esin Durmus
Deep Ganguli
Shauna Kravec
Buck Shlegeris
Jared Kaplan
Holden Karnofsky
Evan Hubinger
Roger C. Grosse
Samuel R. Bowman
David Duvenaud

Papers citing "Sabotage Evaluations for Frontier Models"

1 / 1 papers shown
Evaluation Faking: Unveiling Observer Effects in Safety Evaluation of Frontier AI Systems
Yihe Fan
Wenqi Zhang
Xudong Pan
Min Yang
23 May 2025