
On Calibration of LLM-based Guard Models for Reliable Content Moderation
Papers citing "On Calibration of LLM-based Guard Models for Reliable Content Moderation"
Title |
---|
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, ..., Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa |
Unlearn What You Want to Forget: Efficient Unlearning for LLMs. Jiaao Chen, Diyi Yang |
Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom |