2109.06950
Automatically Exposing Problems with Neural Dialog Models
14 September 2021
Dian Yu
Kenji Sagae
Papers citing "Automatically Exposing Problems with Neural Dialog Models"
9 / 9 papers shown
Title
Legilimens: Practical and Unified Content Moderation for Large Language Model Services
Jialin Wu
Jiangyi Deng
Shengyuan Pang
Yanjiao Chen
Jiayang Xu
Xinfeng Li
Wenyuan Xu
34
6
0
28 Aug 2024
Tradeoffs Between Alignment and Helpfulness in Language Models with Representation Engineering
Yotam Wolf
Noam Wies
Dorin Shteyman
Binyamin Rothberg
Yoav Levine
Amnon Shashua
LLMSV
28
13
0
29 Jan 2024
Fundamental Limitations of Alignment in Large Language Models
Yotam Wolf
Noam Wies
Oshri Avnery
Yoav Levine
Amnon Shashua
ALM
11
139
0
19 Apr 2023
Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation
Zhexin Zhang
Jiale Cheng
Hao-Lun Sun
Jiawen Deng
Fei Mi
Yasheng Wang
Lifeng Shang
Minlie Huang
SILM
32
8
0
04 Dec 2022
Red Teaming Language Models with Language Models
Ethan Perez
Saffron Huang
Francis Song
Trevor Cai
Roman Ring
John Aslanides
Amelia Glaese
Nat McAleese
G. Irving
AAML
13
609
0
07 Feb 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,587
0
18 Sep 2019
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
320
11,681
0
09 Mar 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,743
0
26 Sep 2016
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
214
1,327
0
05 Jun 2016