ResearchTrend.AI
Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism

2 November 2023
Lang Cao
arXiv: 2311.01041

Papers citing "Learn to Refuse: Making Large Language Models More Controllable and Reliable through Knowledge Scope Limitation and Refusal Mechanism"

8 / 8 papers shown
DualBreach: Efficient Dual-Jailbreaking via Target-Driven Initialization and Multi-Target Optimization
Xinzhe Huang
Kedong Xiu
T. Zheng
Churui Zeng
Wangze Ni
Zhan Qin
K. Ren
Chong Chen
AAML
33
0
0
21 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
30
0
0
05 Apr 2025
TableMaster: A Recipe to Advance Table Understanding with Language Models
Lang Cao
Hanbing Liu
LMTD
RALM
228
0
1
31 Jan 2025
Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing
Wenhao Liu
Siyu An
Junru Lu
Muling Wu
Tianlong Li
Xiaohua Wang
Xiaoqing Zheng
Di Yin
Xing Sun
Xuanjing Huang
31
0
0
25 Sep 2024
Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering
Zhihua Wen
Zhiliang Tian
Z. Jian
Zhen Huang
Pei Ke
Yifu Gao
Minlie Huang
Dongsheng Li
44
9
0
23 May 2024
Benchmarking Retrieval-Augmented Generation for Medicine
Guangzhi Xiong
Qiao Jin
Zhiyong Lu
Aidong Zhang
RALM
83
151
0
20 Feb 2024
TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution
Wenyue Hua
Xianjun Yang
Zelong Li
Cheng Wei
Yongfeng Zhang
LLMAG
40
4
0
02 Feb 2024
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML
AIMat
224
1,208
0
12 Jun 2017