Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback

5 August 2022

Jason Weston

Papers citing "Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback"

Reasons to Reject? Aligning Language Models with Judgments Weiwen Xu Deng Cai Zhisong Zhang Wai Lam Shuming Shi ALM 21 14 0 22 Dec 2023
Learning to love diligent trolls: Accounting for rater effects in the dialogue safety task M. Ilagan 36 0 0 30 Oct 2023
Let Me Teach You: Pedagogical Foundations of Feedback for Language Models Beatriz Borges Niket Tandon Tanja Kaser Antoine Bosselut 22 3 0 01 Jul 2023
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training Zeqiu Wu Yushi Hu Weijia Shi Nouha Dziri Alane Suhr Prithviraj Ammanabrolu Noah A. Smith Mari Ostendorf Hannaneh Hajishirzi ALM 30 304 0 02 Jun 2023
Continually Improving Extractive QA via Human Feedback Ge Gao Hung-Ting Chen Yoav Artzi Eunsol Choi 26 12 0 21 May 2023
Training Language Models with Language Feedback at Scale Jérémy Scheurer Jon Ander Campos Tomasz Korbak Jun Shern Chan Angelica Chen Kyunghyun Cho Ethan Perez ALM 39 101 0 28 Mar 2023
On Improving Summarization Factual Consistency from Natural Language Feedback Yixin Liu Budhaditya Deb Milagro Teruel Aaron L Halfaker Dragomir R. Radev Ahmed Hassan Awadallah HILM 27 35 0 20 Dec 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation Chandra Bhagavatula Jena D. Hwang Doug Downey Ronan Le Bras Ximing Lu Lianhui Qin Keisuke Sakaguchi Swabha Swayamdipta Peter West Yejin Choi 21 34 0 19 Dec 2022
The CRINGE Loss: Learning what language not to model Leonard Adolphs Tianyu Gao Jing Xu Kurt Shuster Sainbayar Sukhbaatar Jason Weston MU 23 34 0 10 Nov 2022
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels Weiyan Shi Emily Dinan Kurt Shuster Jason Weston Jing Xu 49 19 0 28 Oct 2022
Towards Boosting the Open-Domain Chatbot with Human Feedback Hua Lu Siqi Bao H. He Fan Wang Hua-Hong Wu Haifeng Wang ALM 20 18 0 30 Aug 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 313 11,953 0 04 Mar 2022
Reason first, then respond: Modular Generation for Knowledge-infused Dialogue Leonard Adolphs Kurt Shuster Jack Urbanek Arthur Szlam Jason Weston KELM LRM 204 41 0 09 Nov 2021
Internet-Augmented Dialogue Generation M. Komeili Kurt Shuster Jason Weston RALM 238 280 0 15 Jul 2021
Dialogue Learning With Human-In-The-Loop Jiwei Li Alexander H. Miller S. Chopra Marc'Aurelio Ranzato Jason Weston OffRL 227 134 0 29 Nov 2016