2105.03023
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
7 May 2021
Alisa Liu
Maarten Sap
Ximing Lu
Swabha Swayamdipta
Chandra Bhagavatula
Noah A. Smith
Yejin Choi
MU
Papers citing
"DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts"
50 / 274 papers shown
Title
Gender Trouble in Language Models: An Empirical Audit Guided by Gender Performativity Theory
Franziska Sofia Hafner
Ana Valdivia
Luc Rocher
7
0
0
20 May 2025
Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
Agam Goyal
Vedant Rathi
William Yeh
Yian Wang
Yuen Chen
Hari Sundaram
17
0
0
20 May 2025
Teaching Models to Understand (but not Generate) High-risk Data
Ryan Yixiang Wang
Matthew Finlayson
Luca Soldaini
Swabha Swayamdipta
Robin Jia
180
0
0
05 May 2025
Semantic Probabilistic Control of Language Models
Kareem Ahmed
Catarina G Belém
Padhraic Smyth
Sameer Singh
47
0
0
04 May 2025
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation
Gwen Yidou Weng
Benjie Wang
Mathias Niepert
BDL
212
0
0
25 Apr 2025
Unified Attacks to Large Language Model Watermarks: Spoofing and Scrubbing in Unauthorized Knowledge Distillation
Xin Yi
Shunfan Zheng
Linlin Wang
Xiaoling Wang
Liang He
AAML
217
0
0
24 Apr 2025
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models
Ziwen Xu
Shuxun Wang
Kewei Xu
Haoming Xu
Mengru Wang
Xinle Deng
Yunzhi Yao
Guozhou Zheng
Huajun Chen
Ningyu Zhang
KELM
LLMSV
240
0
0
21 Apr 2025
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
Heng Chang
Zhiting Fan
Ruizhe Chen
Xiaotang Gai
Luqi Gong
Yan Zhang
Zuozhu Liu
LLMSV
40
1
0
20 Apr 2025
Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
Yuta Matsui
Ryosuke Yamaki
Ryo Ueda
Seitaro Shinagawa
Tadahiro Taniguchi
MLLM
40
1
0
13 Apr 2025
LightDefense: A Lightweight Uncertainty-Driven Defense against Jailbreaks via Shifted Token Distribution
Zhuoran Yang
Jie Peng
Zhen Tan
Tianlong Chen
Yanyong Zhang
AAML
44
0
0
02 Apr 2025
From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment
Yucheng Suo
Fan Ma
Linchao Zhu
T. Wang
Fengyun Rao
Yi Yang
LRM
79
0
0
26 Mar 2025
CLEAR: Contrasting Textual Feedback with Experts and Amateurs for Reasoning
Andrew Rufail
Daniel Kim
Sean O'Brien
Kevin Zhu
LRM
32
0
0
24 Mar 2025
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Ying Shen
Lifu Huang
55
1
0
20 Mar 2025
DAPI: Domain Adaptive Toxicity Probe Vector Intervention for Fine-Grained Detoxification
Cho Hyeonsu
Dooyoung Kim
Youngjoong Ko
MoMe
46
0
0
17 Mar 2025
Palette of Language Models: A Solver for Controlled Text Generation
Zhe Yang
Yi Huang
Yaqin Chen
Xiaoting Wu
Junlan Feng
Chao Deng
52
0
0
14 Mar 2025
Rethinking Prompt-based Debiasing in Large Language Models
Xinyi Yang
Runzhe Zhan
Derek F. Wong
Shu Yang
Junchao Wu
Lidia S. Chao
ALM
70
1
0
12 Mar 2025
Robust Multi-Objective Controlled Decoding of Large Language Models
Seongho Son
William Bankes
Sangwoong Yoon
Shyam Sundhar Ramesh
Xiaohang Tang
Ilija Bogunovic
44
0
0
11 Mar 2025
Mimicking How Humans Interpret Out-of-Context Sentences Through Controlled Toxicity Decoding
Maria Mihaela Trusca
Liesbeth Allein
52
0
0
11 Mar 2025
Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs
Hannah Cyberey
Yangfeng Ji
David Evans
LLMSV
82
1
0
27 Feb 2025
Steered Generation via Gradient Descent on Sparse Features
Sumanta Bhattacharyya
Pedram Rooshenas
LLMSV
43
0
0
25 Feb 2025
Selective Prompt Anchoring for Code Generation
Yuan Tian
Tianyi Zhang
105
3
0
24 Feb 2025
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
89
1
0
20 Feb 2025
Enabling Autoregressive Models to Fill In Masked Tokens
Daniel Israel
Aditya Grover
Mathias Niepert
AI4CE
55
1
0
09 Feb 2025
Risk-Aware Distributional Intervention Policies for Language Models
Bao Nguyen
Binh Nguyen
Duy Nguyen
V. Nguyen
34
1
0
28 Jan 2025
Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators
Dingkang Yang
Dongling Xiao
Jinjie Wei
Mingcheng Li
Zhaoyu Chen
Ke Li
Li Zhang
HILM
97
4
0
28 Jan 2025
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu
Sophia Ananiadou
KELM
53
1
0
24 Jan 2025
Copyright-Protected Language Generation via Adaptive Model Fusion
Javier Abad
Konstantin Donhauser
Francesco Pinto
Fanny Yang
86
1
0
09 Dec 2024
Towards Resource Efficient and Interpretable Bias Mitigation in Large Language Models
S. Tong
Eliott Zemour
Rawisara Lohanimit
Lalana Kagal
70
0
0
02 Dec 2024
VaLiD: Mitigating the Hallucination of Large Vision Language Models by Visual Layer Fusion Contrastive Decoding
Jiaqi Wang
Yifei Gao
Jitao Sang
MLLM
123
2
0
24 Nov 2024
Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM
Haw-Shiuan Chang
Nanyun Peng
Mohit Bansal
Anil Ramakrishna
Tagyoung Chung
39
2
0
03 Nov 2024
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework
Yifan Wang
Vera Demberg
29
0
0
24 Oct 2024
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficient Semantic Steering in Large Language Models
Xintong Wang
Jingheng Pan
Longqin Jiang
Liang Ding
Xingshan Li
Chris Biemann
LLMSV
34
0
0
23 Oct 2024
Cross-model Control: Improving Multiple Large Language Models in One-time Training
Jiayi Wu
Hao Sun
Hengyi Cai
Lixin Su
Shuaiqiang Wang
Dawei Yin
Xiang Li
Ming Gao
MU
39
0
0
23 Oct 2024
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li
Jiarui Liu
Andy Liu
Xuhui Zhou
Mona Diab
Maarten Sap
62
7
0
21 Oct 2024
What's New in My Data? Novelty Exploration via Contrastive Generation
Masaru Isonuma
Ivan Titov
31
0
0
18 Oct 2024
Guaranteed Generation from Large Language Models
Minbeom Kim
Thibaut Thonet
Jos Rozen
Hwaran Lee
Kyomin Jung
Marc Dymetman
46
1
0
09 Oct 2024
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
Tao Meng
Ninareh Mehrabi
Palash Goyal
Anil Ramakrishna
Aram Galstyan
Richard Zemel
Kai-Wei Chang
Rahul Gupta
Charith Peris
27
1
0
07 Oct 2024
Control Large Language Models via Divide and Conquer
Bingxuan Li
Yiwei Wang
Tao Meng
Kai-Wei Chang
Nanyun Peng
34
0
0
06 Oct 2024
From Pixels to Personas: Investigating and Modeling Self-Anthropomorphism in Human-Robot Dialogues
Yu Li
Devamanyu Hazarika
Di Jin
Julia Hirschberg
Yang Liu
30
0
0
04 Oct 2024
Large Language Models can be Strong Self-Detoxifiers
Ching-Yun Ko
Pin-Yu Chen
Payel Das
Youssef Mroueh
Soham Dan
Georgios Kollias
Subhajit Chaudhury
Tejaswini Pedapati
Luca Daniel
34
2
0
04 Oct 2024
Interpretable Contrastive Monte Carlo Tree Search Reasoning
Zitian Gao
Boye Niu
Xuzheng He
Haotian Xu
Hongzhang Liu
Aiwei Liu
Xuming Hu
Lijie Wen
LRM
65
28
0
02 Oct 2024
Unleashing the Potentials of Likelihood Composition for Multi-modal Language Models
Shitian Zhao
Renrui Zhang
Xu Luo
Yan Wang
Shanghang Zhang
Peng Gao
18
0
0
01 Oct 2024
FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
Zhidong Gao
Yu Zhang
Zhenxiao Zhang
Yanmin Gong
Yuanxiong Guo
23
0
0
01 Oct 2024
Inference-Time Language Model Alignment via Integrated Value Guidance
Zhixuan Liu
Zhanhui Zhou
Yuanfu Wang
Chao Yang
Yu Qiao
35
7
0
26 Sep 2024
Householder Pseudo-Rotation: A Novel Approach to Activation Editing in LLMs with Direction-Magnitude Perspective
Van-Cuong Pham
Thien Huu Nguyen
LLMSV
43
3
0
16 Sep 2024
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
87
6
0
11 Sep 2024
LLM-based multi-agent poetry generation in non-cooperative environments
Ran Zhang
Steffen Eger
LLMAG
39
5
0
05 Sep 2024
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang
Yiwei Wang
Bryan Hooi
Yujun Cai
Nanyun Peng
Kai-Wei Chang
48
3
0
05 Sep 2024
Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Chenhan Yuan
Fei Huang
Ru Peng
Keming Lu
Bowen Yu
Chang Zhou
Jingren Zhou
KELM
47
0
0
20 Aug 2024
Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts
Tingchen Fu
Yupeng Hou
Julian McAuley
Rui Yan
45
3
0
09 Aug 2024