Kanana: Compute-efficient Bilingual Language Models

26 February 2025
Kanana LLM Team
Yunju Bak
Hojin Lee
Minho Ryu
Jiyeon Ham
Seungjae Jung
Daniel Wontae Nam
Taegyeong Eo
Donghun Lee
Doohae Jung
Boseop Kim
Nayeon Kim
Jaesun Park
Hyunho Kim
Hyunwoong Ko
Changmin Lee
Kyoung-Woon On
Seulye Baeg
Junrae Cho
Sunghee Jung
Jieun Kang
EungGyun Kim
Eunhwa Kim
Byeongil Ko
Daniel Lee
Minchul Lee
Miok Lee
Shinbok Lee
Gaeun Seo
Abstract

We introduce Kanana, a series of bilingual language models that demonstrate excellent performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning and distillation. The report also outlines the methods used during post-training of the Kanana models, namely supervised fine-tuning and preference optimization, which aim to improve the models' ability to interact seamlessly with users. Lastly, the report describes plausible approaches for adapting the language models to specific scenarios, such as embedding, retrieval-augmented generation, and function calling. The Kanana model series spans from 2.1B to 32.5B parameters, with the 2.1B models (base, instruct, embedding) publicly released to promote research on Korean language models.

@article{team2025_2502.18934,
  title={Kanana: Compute-efficient Bilingual Language Models},
  author={Kanana LLM Team and Yunju Bak and Hojin Lee and Minho Ryu and Jiyeon Ham and Seungjae Jung and Daniel Wontae Nam and Taegyeong Eo and Donghun Lee and Doohae Jung and Boseop Kim and Nayeon Kim and Jaesun Park and Hyunho Kim and Hyunwoong Ko and Changmin Lee and Kyoung-Woon On and Seulye Baeg and Junrae Cho and Sunghee Jung and Jieun Kang and EungGyun Kim and Eunhwa Kim and Byeongil Ko and Daniel Lee and Minchul Lee and Miok Lee and Shinbok Lee and Gaeun Seo},
  journal={arXiv preprint arXiv:2502.18934},
  year={2025}
}