2210.10951
Cited By
Automatic Document Selection for Efficient Encoder Pretraining
20 October 2022
Yukun Feng
Patrick Xia
Benjamin Van Durme
João Sedoc
Papers citing
"Automatic Document Selection for Efficient Encoder Pretraining"
12 / 12 papers shown
Title
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
100
614
0
15 Feb 2022
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
205
5,454
0
07 Jul 2021
On the Importance of Effectively Adapting Pretrained Language Models for Active Learning
Katerina Margatina
Loïc Barrault
Nikolaos Aletras
47
38
0
16 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
432
2,081
0
31 Dec 2020
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Tien Vu
A. Nguyen
VLM
82
914
0
20 May 2020
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
134
2,420
0
23 Apr 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
539
4,773
0
23 Jan 2020
How Can We Know What Language Models Know?
Zhengbao Jiang
Frank F. Xu
Jun Araki
Graham Neubig
KELM
126
1,402
0
28 Nov 2019
What Does BERT Look At? An Analysis of BERT's Attention
Kevin Clark
Urvashi Khandelwal
Omer Levy
Christopher D. Manning
MILM
209
1,592
0
11 Jun 2019
What do you learn from context? Probing for sentence structure in contextualized word representations
Ian Tenney
Patrick Xia
Berlin Chen
Alex Jinpeng Wang
Adam Poliak
...
Najoung Kim
Benjamin Van Durme
Samuel R. Bowman
Dipanjan Das
Ellie Pavlick
173
858
0
15 May 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
129
1,469
0
15 May 2019
SciBERT: A Pretrained Language Model for Scientific Text
Iz Beltagy
Kyle Lo
Arman Cohan
120
2,957
0
26 Mar 2019