2410.09102
Cited By
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
9 October 2024
Tong Wu
Shujian Zhang
Kaiqiang Song
Silei Xu
Sanqiang Zhao
Ravi Agrawal
Sathish Indurthi
Chong Xiang
Prateek Mittal
Wenxuan Zhou
Papers citing
"Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy"
4 / 4 papers shown
The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)
Zihao Wang
Yibo Jiang
Jiahao Yu
Heqing Huang
35
0
0
01 May 2025
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov
Arman Zharmagambetov
Aaron Grattafiori
Chuan Guo
Kamalika Chaudhuri
AAML
35
0
0
22 Apr 2025
ASIDE: Architectural Separation of Instructions and Data in Language Models
Egor Zverev
Evgenii Kortukov
Alexander Panfilov
Soroush Tabesh
Alexandra Volkova
Sebastian Lapuschkin
Wojciech Samek
Christoph H. Lampert
AAML
54
1
0
13 Mar 2025
Control Illusion: The Failure of Instruction Hierarchies in Large Language Models
Yilin Geng
Hao Li
Honglin Mu
Xudong Han
Timothy Baldwin
Omri Abend
Eduard H. Hovy
Lea Frermann
41
2
0
21 Feb 2025