Byte Pair Encoding is Suboptimal for Language Model Pretraining (7 April 2020)
Kaj Bostrom, Greg Durrett
arXiv:2004.03720

Papers citing "Byte Pair Encoding is Suboptimal for Language Model Pretraining"

Showing 21 of 121 citing papers.
Perceiver IO: A General Architecture for Structured Inputs & Outputs (30 Jul 2021)
Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, ..., Olivier J. Hénaff, M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira
Tags: MLLM, VLM, GNN

Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains (25 Jun 2021)
Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei
Tags: VLM, ALM

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization (23 Jun 2021)
Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler

Evaluating Various Tokenizers for Arabic Text Classification (14 Jun 2021)
Zaid Alyafeai, Maged S. Al-Shaibani, Mustafa Ghaleb, Irfan Ahmad

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality (07 May 2021)
Adithya V Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy-Hien Vu, H. Andrew Schwartz

How (Non-)Optimal is the Lexicon? (29 Apr 2021)
Tiago Pimentel, Irene Nikkarinen, Kyle Mahowald, Ryan Cotterell, Damián E. Blasi

Multi-view Subword Regularization (15 Mar 2021)
Xinyi Wang, Sebastian Ruder, Graham Neubig

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation (11 Mar 2021)
J. Clark, Dan Garrette, Iulia Turc, John Wieting

Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words (02 Jan 2021)
Valentin Hofmann, J. Pierrehumbert, Hinrich Schütze

Morphology Matters: A Multilingual Language Modeling Analysis (11 Dec 2020)
Hyunji Hayley Park, Katherine J. Zhang, Coleman Haley, K. Steimel, Han Liu, Lane Schwartz

Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks (05 Dec 2020)
Modestas Filipavicius, Matteo Manica, Joris Cadow, María Rodríguez Martínez

Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages (04 Nov 2020)
Kushal Kumar Jain, Adwait Deshpande, Kumar Shridhar, F. Laumann, Ayushman Dash

Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality (24 Oct 2020)
Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Rajani, N. Keskar, Thamar Solorio

Dynamic Contextualized Word Embeddings (23 Oct 2020)
Valentin Hofmann, J. Pierrehumbert, Hinrich Schütze

UniCase -- Rethinking Casing in Language Models (22 Oct 2020)
Rafal Powalski, Tomasz Stanislawek

An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks (06 Oct 2020)
Kyubyong Park, Joohong Lee, Seongbo Jang, Dawoon Jung

Will it Unblend? (18 Sep 2020)
Yuval Pinter, Cassandra L. Jacobs, Jacob Eisenstein

Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches (16 Sep 2020)
Juan Cruz-Benito, Sanjay Vishwakarma, Francisco Martín-Fernández, Ismael Faro (IBM Quantum)

A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models (19 May 2020)
Mohammad Zeineldeen, Albert Zeyer, Wei Zhou, T. Ng, Ralf Schlüter, Hermann Ney

DagoBERT: Generating Derivational Morphology with a Pretrained Language Model (02 May 2020)
Valentin Hofmann, J. Pierrehumbert, Hinrich Schütze

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (26 Sep 2016)
Yonghui Wu, M. Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean
Tags: AIMat