Preemptive Detection and Steering of LLM Misalignment via Latent Reachability

25 September 2025

Papers citing "Preemptive Detection and Steering of LLM Misalignment via Latent Reachability"


Title | Authors | Date
RepV: Safety-Separable Latent Spaces for Scalable Neurosymbolic Plan Verification | Yunhao Yang, N. Bhatt, Pranay Samineni, Rohan Siva, Zhanyang Wang, Ufuk Topcu | 30 Oct 2025
From Refusal to Recovery: A Control-Theoretic Approach to Generative AI Guardrails | Ravi Pandya, Madison Bland, D. Nguyen, Changliu Liu, J. F. Fisac, Andrea V. Bajcsy | 15 Oct 2025