Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.00612
Cited By
Eight Things to Know about Large Language Models
2 April 2023
Sam Bowman
ALM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Eight Things to Know about Large Language Models"
36 / 36 papers shown
Title
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij
Felix Hofstätter
Ollie Jaffe
Samuel F. Brown
Francis Rhys Ward
ELM
76
29
0
11 Jun 2024
Large Language Models
Michael R Douglas
LLMAG
LM&MA
138
628
0
11 Jul 2023
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
196
1,618
0
15 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
131
375
0
07 Dec 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
189
3,128
0
20 Oct 2022
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
103
192
0
30 Aug 2022
What Do NLP Researchers Believe? Results of the NLP Community Metasurvey
Julian Michael
Ari Holtzman
Alicia Parrish
Aaron Mueller
Alex Jinpeng Wang
...
Divyam Madaan
Nikita Nangia
Richard Yuanzhe Pang
Jason Phang
Sam Bowman
59
39
0
26 Aug 2022
Language Model Cascades
David Dohan
Winnie Xu
Aitor Lewkowycz
Jacob Austin
David Bieber
...
Henryk Michalewski
Rif A. Saurous
Jascha Narain Sohl-Dickstein
Kevin Patrick Murphy
Charles Sutton
ReLM
LRM
92
102
0
21 Jul 2022
Language Models (Mostly) Know What They Know
Saurav Kadavath
Tom Conerly
Amanda Askell
T. Henighan
Dawn Drain
...
Nicholas Joseph
Benjamin Mann
Sam McCandlish
C. Olah
Jared Kaplan
ELM
117
817
0
11 Jul 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
279
2,480
0
15 Jun 2022
Self-critiquing models for assisting human evaluators
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Ouyang Long
Jonathan Ward
Jan Leike
ALM
ELM
103
302
0
12 Jun 2022
When a sentence does not introduce a discourse entity, Transformer-based models still sometimes refer to it
Sebastian Schuster
Tal Linzen
62
25
0
06 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
382
3,542
0
29 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
489
6,240
0
05 Apr 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
297
265
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
817
9,387
0
28 Jan 2022
Show Your Work: Scratchpads for Intermediate Computation with Language Models
Maxwell Nye
Anders Andreassen
Guy Gur-Ari
Henryk Michalewski
Jacob Austin
...
Aitor Lewkowycz
Maarten Bosma
D. Luan
Charles Sutton
Augustus Odena
ReLM
LRM
177
746
0
30 Nov 2021
Shaking the foundations: delusions in sequence models for interaction and control
Pedro A. Ortega
M. Kunesch
Grégoire Delétang
Tim Genewein
Jordi Grau-Moya
...
Yutian Chen
Scott E. Reed
Marcus Hutter
Nando de Freitas
Shane Legg
77
64
0
20 Oct 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman
OffRL
76
45
0
15 Oct 2021
Sorting through the noise: Testing robustness of information processing in pre-trained language models
Lalchand Pandia
Allyson Ettinger
80
37
0
25 Sep 2021
Do Language Models Know the Way to Rome?
Bastien Liétard
Mostafa Abdou
Anders Søgaard
90
19
0
16 Sep 2021
Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color
Mostafa Abdou
Artur Kulmizev
Daniel Hershcovich
Stella Frank
Ellie Pavlick
Anders Søgaard
71
123
0
13 Sep 2021
Do Prompt-Based Models Really Understand the Meaning of their Prompts?
Albert Webson
Ellie Pavlick
LRM
107
371
0
02 Sep 2021
The Values Encoded in Machine Learning Research
Abeba Birhane
Pratyusha Kalluri
Dallas Card
William Agnew
Ravit Dotan
Michelle Bao
71
286
0
29 Jun 2021
A Survey of Race, Racism, and Anti-Racism in NLP
Anjalie Field
Su Lin Blodgett
Zeerak Talat
Yulia Tsvetkov
79
124
0
21 Jun 2021
Implicit Representations of Meaning in Neural Language Models
Belinda Z. Li
Maxwell Nye
Jacob Andreas
NAI
MILM
60
176
0
01 Jun 2021
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Alexander DÁmour
Katherine A. Heller
D. Moldovan
Ben Adlam
B. Alipanahi
...
Kellie Webster
Steve Yadlowsky
T. Yun
Xiaohua Zhai
D. Sculley
OffRL
117
686
0
06 Nov 2020
Hidden Incentives for Auto-Induced Distributional Shift
David M. Krueger
Tegan Maharaj
Jan Leike
67
51
0
19 Sep 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
161
2,737
0
05 Jun 2020
Pretrained Transformers Improve Out-of-Distribution Robustness
Dan Hendrycks
Xiaoyuan Liu
Eric Wallace
Adam Dziedzic
R. Krishnan
D. Song
OOD
191
434
0
13 Apr 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
605
4,822
0
23 Jan 2020
Optimal Policies Tend to Seek Power
Alexander Matt Turner
Logan Smith
Rohin Shah
Andrew Critch
Prasad Tadepalli
56
70
0
03 Dec 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
466
1,734
0
18 Sep 2019
Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack
Emily Dinan
Samuel Humeau
Bharath Chintagunta
Jason Weston
81
246
0
17 Aug 2019
Risks from Learned Optimization in Advanced Machine Learning Systems
Evan Hubinger
Chris van Merwijk
Vladimir Mikulik
Joar Skalse
Scott Garrabrant
89
152
0
05 Jun 2019
Attention is not Explanation
Sarthak Jain
Byron C. Wallace
FAtt
145
1,324
0
26 Feb 2019
1