2404.06508
Cited By
On the Effect of (Near) Duplicate Subwords in Language Modelling
9 April 2024
Anton Schäfer
Thomas Hofmann
Imanol Schlag
Tiago Pimentel
Papers citing "On the Effect of (Near) Duplicate Subwords in Language Modelling"
7 / 7 papers shown
Title
Lexical Generalization Improves with Larger Models and Longer Training
Elron Bandel
Yoav Goldberg
Yanai Elazar
52
6
0
23 Oct 2022
How BPE Affects Memorization in Transformers
Eugene Kharitonov
Marco Baroni
Dieuwke Hupkes
163
32
0
06 Oct 2021
Frequency Effects on Syntactic Rule Learning in Transformers
Jason W. Wei
Dan Garrette
Tal Linzen
Ellie Pavlick
88
62
0
14 Sep 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
256
1,996
0
31 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
245
31,257
0
16 Jan 2013