ResearchTrend.AI
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats

26 November 2024
Jiaxin Wen, Vivek Hebbar, Caleb Larson, Aryan Bhatt, Ansh Radhakrishnan, Mrinank Sharma, Henry Sleight, Shi Feng, He He, Ethan Perez, Buck Shlegeris, Akbir Khan
    AAML
arXiv: 2411.17693

Papers citing "Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats"

3 / 3 papers shown
SHADE-Arena: Evaluating Sabotage and Monitoring in LLM Agents
Jonathan Kutasov, Yuqi Sun, Paul Colognese, Teun van der Weij, Linda Petrini, ..., Xiang Deng, Henry Sleight, Tyler Tracy, Buck Shlegeris, Joe Benton
LLMAG
28
0
0
17 Jun 2025
Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors
Chen Yueh-Han, Nitish Joshi, Yulin Chen, Maksym Andriushchenko, Rico Angell, He He
AAML
104
0
0
12 Jun 2025
A sketch of an AI control safety case
Tomek Korbak, Joshua Clymer, Benjamin Hilton, Buck Shlegeris, Geoffrey Irving
147
10
0
28 Jan 2025