Automatically Exposing Problems with Neural Dialog Models

Automatically Exposing Problems with Neural Dialog Models

14 September 2021

Dian Yu

Papers citing "Automatically Exposing Problems with Neural Dialog Models"

9 / 9 papers shown

Title
Legilimens: Practical and Unified Content Moderation for Large Language Model Services Jialin Wu Jiangyi Deng Shengyuan Pang Yanjiao Chen Jiayang Xu Xinfeng Li Wenyuan Xu 34 6 0 28 Aug 2024
Tradeoffs Between Alignment and Helpfulness in Language Models with Representation Engineering Yotam Wolf Noam Wies Dorin Shteyman Binyamin Rothberg Yoav Levine Amnon Shashua LLMSV 28 13 0 29 Jan 2024
Fundamental Limitations of Alignment in Large Language Models Yotam Wolf Noam Wies Oshri Avnery Yoav Levine Amnon Shashua ALM 11 139 0 19 Apr 2023
Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation Zhexin Zhang Jiale Cheng Hao-Lun Sun Jiawen Deng Fei Mi Yasheng Wang Lifeng Shang Minlie Huang SILM 32 8 0 04 Dec 2022
Red Teaming Language Models with Language Models Ethan Perez Saffron Huang Francis Song Trevor Cai Roman Ring John Aslanides Amelia Glaese Nat McAleese G. Irving AAML 13 609 0 07 Feb 2022
Fine-Tuning Language Models from Human Preferences Daniel M. Ziegler Nisan Stiennon Jeff Wu Tom B. Brown Alec Radford Dario Amodei Paul Christiano G. Irving ALM 280 1,587 0 18 Sep 2019
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks Chelsea Finn Pieter Abbeel Sergey Levine OOD 320 11,681 0 09 Mar 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,743 0 26 Sep 2016
Deep Reinforcement Learning for Dialogue Generation Jiwei Li Will Monroe Alan Ritter Michel Galley Jianfeng Gao Dan Jurafsky 214 1,327 0 05 Jun 2016