2212.07284
Cited By
MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling
14 December 2022
Nathan Godey
Roman Castagné
Eric Villemonte de la Clergerie
Benoît Sagot
Papers citing "MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling"
5 / 5 papers shown
Elementwise Language Representation
Du-Yeong Kim
Jeeeun Kim
33
0
0
27 Feb 2023
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
80
235
0
31 Dec 2020
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri
Olivier Ferret
Thomas Lavergne
Hiroshi Noji
Pierre Zweigenbaum
Junichi Tsujii
77
156
0
20 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016