2204.10815
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning
22 April 2022
Md. Mofijul Islam
Gustavo Aguilar
Pragaash Ponnusamy
Clint Solomon Mathialagan
Chengyuan Ma
Chenlei Guo
VLM
Papers citing "A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning"
4 / 4 papers shown
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
40
2
0
28 Oct 2024
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
38
46
0
14 Jul 2022
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
77
65
0
24 Oct 2020
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Rajani
N. Keskar
Thamar Solorio
49
12
0
24 Oct 2020