2202.10879
Evaluating Persian Tokenizers
22 February 2022
Danial Kamali, Behrooz Janfada, Mohammad Ebrahim Shenasa, B. Minaei-Bidgoli
Papers citing "Evaluating Persian Tokenizers"
3 / 3 papers shown
Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models
Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov
51
81
0
23 May 2023
Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing
Minh Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
52
132
0
09 Jan 2021
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
Peng Qi, Yuhao Zhang, Yuhui Zhang, Jason Bolton, Christopher D. Manning
AI4TS
213
1,654
0
16 Mar 2020