2101.00027
Cited By
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
31 December 2020
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
Charles Foster
Jason Phang
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
Papers citing
"The Pile: An 800GB Dataset of Diverse Text for Language Modeling"
49 / 399 papers shown
Title
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
24
27
0
25 May 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
Shruti Palaskar
Akshita Bhagia
Yonatan Bisk
Florian Metze
A. Black
Ana Marasović
18
4
0
24 May 2022
Life after BERT: What do Other Muppets Understand about Language?
Vladislav Lialin
Kevin Zhao
Namrata Shivagunde
Anna Rumshisky
44
6
0
21 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
44
65
0
20 May 2022
Modeling Exemplification in Long-form Question Answering via Retrieval
Shufan Wang
Fangyuan Xu
Laure Thompson
Eunsol Choi
Mohit Iyyer
38
10
0
19 May 2022
Entity Cloze By Date: What LMs Know About Unseen Entities
Yasumasa Onoe
Michael J.Q. Zhang
Eunsol Choi
Greg Durrett
KELM
23
49
0
05 May 2022
Language Models in the Loop: Incorporating Prompting into Weak Supervision
Ryan Smith
Jason Alan Fries
Braden Hancock
Stephen H. Bach
50
53
0
04 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
59
3,488
0
02 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
46
3,349
0
29 Apr 2022
Indiscriminate Data Poisoning Attacks on Neural Networks
Yiwei Lu
Gautam Kamath
Yaoliang Yu
AAML
43
24
0
19 Apr 2022
mGPT: Few-Shot Learners Go Multilingual
Oleh Shliazhko
Alena Fenogenova
Maria Tikhonova
Vladislav Mikhailov
Anastasia Kozlova
Tatiana Shavrina
43
149
0
15 Apr 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
93
801
0
14 Apr 2022
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried
Armen Aghajanyan
Jessy Lin
Sida I. Wang
Eric Wallace
Freda Shi
Ruiqi Zhong
Wen-tau Yih
Luke Zettlemoyer
M. Lewis
SyDa
28
626
0
12 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
31
167
0
12 Apr 2022
ANNA: Enhanced Language Representation for Question Answering
Changwook Jun
Hansol Jang
Myoseop Sim
Hyun Kim
Jooyoung Choi
Kyungkoo Min
Kyunghoon Bae
31
6
0
28 Mar 2022
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Erik Nijkamp
Bo Pang
Hiroaki Hayashi
Lifu Tu
Haiquan Wang
Yingbo Zhou
Silvio Savarese
Caiming Xiong
ELM
90
974
0
25 Mar 2022
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models
Kanishka Misra
19
58
0
24 Mar 2022
Speciesist Language and Nonhuman Animal Bias in English Masked Language Models
Masashi Takeshita
Rafal Rzepka
K. Araki
24
6
0
10 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
40
1,400
0
25 Feb 2022
Interpreting Language Models with Contrastive Explanations
Kayo Yin
Graham Neubig
MILM
21
77
0
21 Feb 2022
Better Together? An Evaluation of AI-Supported Code Translation
Justin D. Weisz
Michael J. Muller
Steven I. Ross
Fernando Martinez
Stephanie Houde
Mayank Agarwal
Kartik Talamadupula
John T. Richards
29
67
0
15 Feb 2022
Deduplicating Training Data Mitigates Privacy Risks in Language Models
Nikhil Kandpal
Eric Wallace
Colin Raffel
PILM
MU
51
274
0
14 Feb 2022
A Survey on Artificial Intelligence for Source Code: A Dialogue Systems Perspective
Erfan Al-Hossami
Samira Shaikh
29
6
0
10 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
53
850
0
07 Feb 2022
Unified Scaling Laws for Routed Language Models
Aidan Clark
Diego de Las Casas
Aurelia Guy
A. Mensch
Michela Paganini
...
Oriol Vinyals
Jack W. Rae
Erich Elsen
Koray Kavukcuoglu
Karen Simonyan
MoE
27
177
0
02 Feb 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
M. Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
90
730
0
28 Jan 2022
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
Suchin Gururangan
Dallas Card
Sarah K. Drier
E. K. Gade
Leroy Z. Wang
Zeyu Wang
Luke Zettlemoyer
Noah A. Smith
175
73
0
25 Jan 2022
Towards a Cleaner Document-Oriented Multilingual Crawled Corpus
Julien Abadji
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
CLL
34
153
0
17 Jan 2022
Datasheet for the Pile
Stella Biderman
Kieran Bicheno
Leo Gao
52
35
0
13 Jan 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
31
207
0
07 Jan 2022
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
61
188
0
20 Dec 2021
Learning To Retrieve Prompts for In-Context Learning
Ohad Rubin
Jonathan Herzig
Jonathan Berant
VPVLM
RALM
14
666
0
16 Dec 2021
Show, Write, and Retrieve: Entity-aware Article Generation and Retrieval
Zhongping Zhang
Yiwen Gu
Bryan A. Plummer
40
2
0
11 Dec 2021
Improving language models by retrieving from trillions of tokens
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
...
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM
RALM
90
1,016
0
08 Dec 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
12
1,377
0
03 Nov 2021
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA
VLM
AI4CE
83
1,030
0
01 Nov 2021
Neural Program Generation Modulo Static Analysis
Rohan Mukherjee
Yeming Wen
Dipak Chaudhari
Thomas W. Reps
Swarat Chaudhuri
C. Jermaine
30
24
0
26 Oct 2021
Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation for Open-Domain Dialogue
Lena Reed
Cecilia Li
Angela Ramirez
Liren Wu
M. Walker
28
7
0
15 Oct 2021
Cut the CARP: Fishing for zero-shot story evaluation
Shahbuland Matiana
J. Smith
Ryan Teehan
Louis Castricato
Stella Biderman
Leo Gao
Spencer Frazier
47
16
0
06 Oct 2021
Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers
Narsimha Chilkuri
Eric Hunsberger
Aaron R. Voelker
G. Malik
C. Eliasmith
35
7
0
05 Oct 2021
Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA
Manuel R. Ciosici
Joe Cecil
Alex Hedges
Dong-Ho Lee
Marjorie Freedman
R. Weischedel
25
9
0
04 Oct 2021
Language Models are Few-shot Multilingual Learners
Genta Indra Winata
Andrea Madotto
Zhaojiang Lin
Rosanne Liu
J. Yosinski
Pascale Fung
ELM
LRM
36
132
0
16 Sep 2021
Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to Corpus Exploration
Shufan Wang
Laure Thompson
Mohit Iyyer
180
66
0
13 Sep 2021
Teaching Autoregressive Language Models Complex Tasks By Demonstration
Gabriel Recchia
26
22
0
05 Sep 2021
Intersectional Bias in Causal Language Models
Liam Magee
Lida Ghahremanlou
K. Soldatić
S. Robertson
191
31
0
16 Jul 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
593
0
14 Jul 2021
Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy
Nicholas Vincent
21
57
0
11 May 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
35
207
0
03 Feb 2021