arXiv:2311.06377
Heaps' Law in GPT-Neo Large Language Model Emulated Corpora
10 November 2023
Uyen Lai
Gurjit S. Randhawa
Paul Sheridan
Papers citing "Heaps' Law in GPT-Neo Large Language Model Emulated Corpora"
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
31 Dec 2020