ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2501.09004
  4. Cited By
Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails

Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails

15 January 2025
Shaona Ghosh
Prasoon Varshney
Makesh Narsimhan Sreedhar
Aishwarya Padmakumar
Traian Rebedea
Jibin Rajan Varghese
Christopher Parisien
ArXiv (abs)PDFHTML

Papers citing "Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails"

12 / 12 papers shown
Title
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs
Hiroshi Matsuda
Chunpeng Ma
Masayuki Asahara
99
0
0
11 Jun 2025
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
Kundan Krishna
Joseph Y Cheng
Charles Maalouf
Leon A Gatys
30
0
0
30 May 2025
OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Modalities
OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Modalities
Sahil Verma
Keegan E. Hines
J. Bilmes
Charlotte Siska
Luke Zettlemoyer
Hila Gonen
Chandan Singh
AAML
24
0
0
29 May 2025
Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models
Safety Through Reasoning: An Empirical Study of Reasoning Guardrail Models
Makesh Narsimhan Sreedhar
Traian Rebedea
Christopher Parisien
LRM
97
0
0
26 May 2025
Surfacing Semantic Orthogonality Across Model Safety Benchmarks: A Multi-Dimensional Analysis
Surfacing Semantic Orthogonality Across Model Safety Benchmarks: A Multi-Dimensional Analysis
Jonathan Bennion
Shaona Ghosh
Mantek Singh
Nouha Dziri
174
0
0
23 May 2025
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yang Liu
Shengfang Zhai
Mingzhe Du
Yulin Chen
Tri Cao
...
Xuzhao Li
Kun Wang
Junfeng Fang
Jiaheng Zhang
Bryan Hooi
OffRLLRM
107
3
0
16 May 2025
Safety in Large Reasoning Models: A Survey
Safety in Large Reasoning Models: A Survey
Cheng Wang
Yang Liu
Yangqiu Song
Duzhen Zhang
Zechao Li
...
Shengju Yu
Xinfeng Li
Junfeng Fang
Jiaheng Zhang
Bryan Hooi
LRM
461
14
0
24 Apr 2025
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
Yahan Yang
Soham Dan
Shuo Li
Dan Roth
Insup Lee
LRM
99
0
0
21 Apr 2025
The Structural Safety Generalization Problem
The Structural Safety Generalization Problem
Julius Broomfield
Tom Gibbs
Ethan Kosak-Hine
George Ingebretsen
Tia Nasir
Jason Zhang
Reihaneh Iranmanesh
Sara Pieri
Reihaneh Rabbany
Kellin Pelrine
AAML
102
0
0
13 Apr 2025
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages
Priyanshu Kumar
Devansh Jain
Akhila Yerukola
Liwei Jiang
Himanshu Beniwal
Thomas Hartvigsen
Maarten Sap
117
1
0
06 Apr 2025
KSOD: Knowledge Supplement for LLMs On Demand
Haoran Li
Junfeng Hu
105
0
0
10 Mar 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TSLRM
278
26
0
30 Jan 2025
1