ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.20087
  4. Cited By
Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models

Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models

26 May 2025
Makesh Narsimhan Sreedhar
Traian Rebedea
Christopher Parisien
    LRM
ArXiv (abs)PDFHTML

Papers citing "Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models"

5 / 5 papers shown
Title
X-Guard: Multilingual Guard Agent for Content Moderation
X-Guard: Multilingual Guard Agent for Content Moderation
Bibek Upadhayay
Vahid Behzadan
Ph.D
102
3
0
11 Apr 2025
Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking
Reasoning-to-Defend: Safety-Aware Reasoning Can Defend Large Language Models from Jailbreaking
Junda Zhu
Lingyong Yan
Shuaiqiang Wang
Dawei Yin
Lei Sha
AAMLLRM
98
6
0
18 Feb 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TSLRM
276
26
0
30 Jan 2025
Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails
Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails
Shaona Ghosh
Prasoon Varshney
Makesh Narsimhan Sreedhar
Aishwarya Padmakumar
Traian Rebedea
Jibin Rajan Varghese
Christopher Parisien
137
16
0
15 Jan 2025
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements
Jingyu Zhang
Ahmed Elgohary
Ahmed Magooda
Daniel Khashabi
Benjamin Van Durme
473
8
0
11 Oct 2024
1