ResearchTrend.AI
An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks

6 October 2020
Kyubyong Park
Joohong Lee
Seongbo Jang
Dawoon Jung

Papers citing "An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks"

9 / 9 papers shown
KpopMT: Translation Dataset with Terminology for Kpop Fandom
JiWoo Kim
Yunsu Kim
Jinyeong Bak
23
1
0
10 Jul 2024
Can Perplexity Predict Fine-Tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali
Nishant Luitel
Nirajan Bekoju
Anand Kumar Sah
Subarna Shakya
58
1
0
28 Apr 2024
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement
Catherine Arnett
Pamela D. Rivière
Tyler A. Chang
Sean Trott
26
2
0
20 Mar 2024
Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation
Seugnjun Lee
Hyeonseok Moon
Chanjun Park
Heu-Jeoung Lim
34
0
0
26 Jun 2023
From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation
Adarsh Kumar
Pedro Sarmento
41
4
0
18 Apr 2023
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
43
46
0
14 Jul 2022
Impact of Tokenization on Language Models: An Analysis for Turkish
Cagri Toraman
E. Yilmaz
Furkan Şahinuç
Oguzhan Ozcelik
38
74
0
19 Apr 2022
Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
Sabrina J. Mielke
Zaid Alyafeai
Elizabeth Salesky
Colin Raffel
Manan Dey
...
Arun Raja
Chenglei Si
Wilson Y. Lee
Benoît Sagot
Samson Tan
34
143
0
20 Dec 2021
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim
Hyoungseok Kim
Sang-Woo Lee
Gichang Lee
Donghyun Kwak
...
Jaewook Kang
Inho Kang
Jung-Woo Ha
W. Park
Nako Sung
VLM
249
121
0
10 Sep 2021