Advancing LLM Safe Alignment with Safety Representation Ranking
arXiv: 2505.15710 · 21 May 2025
Authors: Tianqi Du, Zeming Wei, Quan Chen, Chenheng Zhang, Yisen Wang
Tags: ALM
Papers citing "Advancing LLM Safe Alignment with Safety Representation Ranking" (4 papers)
ReGA: Representation-Guided Abstraction for Model-based Safeguarding of LLMs
Zeming Wei, Chengcan Wu, Meng Sun · 02 Jun 2025
LiPO: Listwise Preference Optimization through Learning-to-Rank
Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, ..., Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang · 28 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini · 03 Jan 2025
Tags: ALM, LRM
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, ..., Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal · 20 Jun 2024
Tags: ALM, ELM