Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning

16 October 2024

Nyal Patel

Satyapriya Krishna

Sonali Parbhoo

Papers citing "Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse Reinforcement Learning"

1 / 1 papers shown

Effective Red-Teaming of Policy-Adherent Agents
Itay Nakash, George Kour, Koren Lazar, Matan Vetzler, Guy Uziel, Ateret Anaby-Tavor
AAML, 95 0 0, 11 Jun 2025