2311.01041
Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism
2 November 2023
Lang Cao
Papers citing
"Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism"
8 / 8 papers shown
DualBreach: Efficient Dual-Jailbreaking via Target-Driven Initialization and Multi-Target Optimization
Xinzhe Huang
Kedong Xiu
T. Zheng
Churui Zeng
Wangze Ni
Zhan Qin
K. Ren
Chong Chen
AAML
33
0
0
21 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
30
0
0
05 Apr 2025
TableMaster: A Recipe to Advance Table Understanding with Language Models
Lang Cao
Hanbing Liu
LMTD
RALM
228
0
1
31 Jan 2025
Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing
Wenhao Liu
Siyu An
Junru Lu
Muling Wu
Tianlong Li
Xiaohua Wang
Xiaoqing Zheng
Di Yin
Xing Sun
Xuanjing Huang
31
0
0
25 Sep 2024
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
Zhihua Wen
Zhiliang Tian
Z. Jian
Zhen Huang
Pei Ke
Yifu Gao
Minlie Huang
Dongsheng Li
44
9
0
23 May 2024
Benchmarking Retrieval-Augmented Generation for Medicine
Guangzhi Xiong
Qiao Jin
Zhiyong Lu
Aidong Zhang
RALM
83
151
0
20 Feb 2024
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Wenyue Hua
Xianjun Yang
Zelong Li
Cheng Wei
Yongfeng Zhang
LLMAG
40
4
0
02 Feb 2024
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML
AIMat
224
1,208
0
12 Jun 2017