Aligners: Decoupling LLMs and Alignment

7 March 2024
Lilian Ngweta
Mayank Agarwal
Subha Maity
Alex Gittens
Yuekai Sun
Mikhail Yurochkin

Papers citing "Aligners: Decoupling LLMs and Alignment"

10 / 10 papers shown
Out-of-Distribution Detection using Synthetic Data Generation
Momin Abbas
Muneeza Azmat
R. Horesh
Mikhail Yurochkin
169
1
0
05 Feb 2025
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Zhiqing Sun
Songlin Yang
Qinhong Zhou
Hongxin Zhang
Zhenfang Chen
David D. Cox
Yiming Yang
Chuang Gan
SyDa, ALM
99
337
0
04 May 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa, MoMe
209
1,640
0
15 Dec 2022
Repair Is Nearly Generation: Multilingual Program Repair with LLMs
Harshit Joshi
J. Cambronero
Sumit Gulwani
Vu Le
Ivan Radicek
Gust Verbruggen
LRM
56
134
0
24 Aug 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM, ALM
886
13,207
0
04 Mar 2022
Ethical and social risks of harm from Language Models
Laura Weidinger
John F. J. Mellor
Maribeth Rauh
Conor Griffin
J. Uesato
...
Lisa Anne Hendricks
William S. Isaac
Sean Legassick
G. Irving
Iason Gabriel
PILM
122
1,042
0
08 Dec 2021
WILDS: A Benchmark of in-the-Wild Distribution Shifts
Pang Wei Koh
Shiori Sagawa
Henrik Marklund
Sang Michael Xie
Marvin Zhang
...
A. Kundaje
Emma Pierson
Sergey Levine
Chelsea Finn
Percy Liang
OOD
227
1,445
0
14 Dec 2020
Energy-based Out-of-distribution Detection
Weitang Liu
Xiaoyun Wang
John Douglas Owens
Yixuan Li
OODD
271
1,374
0
08 Oct 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
163
1,214
0
24 Sep 2020
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
218
3,365
0
12 Jun 2017