
A.X K1 Technical Report

Sung Jun Cheon
Jaekyung Cho
Seongho Choi
Hyunjun Eun
Seokhwan Jo
Jaehyun Jun
Minsoo Kang
Jin Kim
Jiwon Kim
Minsang Kim
Sungwan Kim
Seungsik Kim
Tae Yoon Kim
Youngrang Kim
Hyeongmun Lee
Sangyeol Lee
Sungeun Lee
Youngsoon Lee
Yujin Lee
Seongmin Ok
Chanyong Park
Hyewoong Park
Junyoung Park
Hyunho Yang
Subin Yi
Soohyun Bae
Dhammiko Arya
Yongseok Choi
Sangho Choi
Dongyeon Cho
Seungmo Cho
Gyoungeun Han
Yong-jin Han
Seokyoung Hong
Hyeon Hwang
Wonbeom Jang
Minjeong Ju
Wonjin Jung
Keummin Ka
Sungil Kang
Dongnam Kim
Joonghoon Kim
Jonghwi Kim
SaeRom Kim
Sangjin Kim
Seongwon Kim
Youngjin Kim
Seojin Lee
Sunwoo Lee
Taehoon Lee
Chanwoo Park
Sohee Park
Sooyeon Park
Yohan Ra
Sereimony Sek
Seungyeon Seo
Gun Song
Sanghoon Woo
Janghan Yoon
Sungbin Yoon
Main: 16 pages, 6 figures, 8 tables; Bibliography: 4 pages
Abstract

We introduce A.X K1, a 519B-parameter Mixture-of-Experts (MoE) language model trained from scratch. Our design leverages scaling laws to optimize training configurations and vocabulary size under fixed computational budgets. A.X K1 is pre-trained on a corpus of approximately 10T tokens, curated by a multi-stage data processing pipeline. Designed to bridge the gap between reasoning capability and inference efficiency, A.X K1 supports explicitly controllable reasoning to facilitate scalable deployment across diverse real-world scenarios. We propose a simple yet effective Think-Fusion training recipe, enabling user-controlled switching between thinking and non-thinking modes within a single unified model. Extensive evaluations demonstrate that A.X K1 achieves performance competitive with leading open-source models, while establishing a distinctive advantage in Korean-language benchmarks.
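The abstract describes user-controlled switching between thinking and non-thinking modes within a single unified model. As a minimal sketch of how such a switch is commonly exposed in hybrid-reasoning chat models, the snippet below assumes a Hugging Face-style chat template that accepts an enable_thinking flag; the model identifier and flag name are illustrative assumptions, not the documented A.X K1 interface.

```python
# Hypothetical sketch: toggling thinking vs. non-thinking modes at inference time.
# Assumes a Hugging Face-style chat template that forwards an `enable_thinking`
# flag to the template, as in other hybrid-reasoning models; the real A.X K1
# interface and model id may differ.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "skt/A.X-K1"  # hypothetical model identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Prove that the sum of two even numbers is even."}]

# Thinking mode: the model emits an intermediate reasoning trace before the answer.
prompt_think = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

# Non-thinking mode: the model answers directly, trading reasoning depth for latency.
prompt_direct = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)

for prompt in (prompt_think, prompt_direct):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```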
