A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers

20 May 2024

Papers citing "A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers"

4 / 4 papers shown

Title
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation Saurabh Kumar Pandey S. Vashistha Debrup Das Somak Aditya Monojit Choudhury AAML 69 0 0 10 Feb 2025
Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer Fanchao Qi Yangyi Chen Xurui Zhang Mukai Li Zhiyuan Liu Maosong Sun AAML SILM 82 175 0 14 Oct 2021
Fine-Tuning Language Models from Human Preferences Daniel M. Ziegler Nisan Stiennon Jeff Wu Tom B. Brown Alec Radford Dario Amodei Paul Christiano G. Irving ALM 280 1,587 0 18 Sep 2019
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks Mohit Iyyer John Wieting Kevin Gimpel Luke Zettlemoyer AAML GAN 196 711 0 17 Apr 2018