Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Ailin Huang
Ang Li
Aobo Kong
Bin Wang
Binxing Jiao
Bo Dong
Bojun Wang
Boyu Chen
Brian Li
Buyun Ma
Chang Su
Changxin Miao
Changyi Wan
Chao Lou
Chen Hu
Chen Xu
Chenfeng Yu
Chengting Feng
Chengyuan Yao
Chunrui Han
Dan Ma
Dapeng Shi
Daxin Jiang
Dehua Ma
Deshan Sun
Di Qi
Enle Liu
Fajie Zhang
Fanqi Wan
Guanzhe Huang
Gulin Yan
Guoliang Cao
Guopeng Li
Han Cheng
Hangyu Guo
Hanshan Zhang
Hao Nie
Haonan Jia
Haoran Lv
Hebin Zhou
Hekun Lv
Heng Wang
Heung-Yeung Shum
Hongbo Huang
Hongbo Peng
Hongyu Zhou
Hongyuan Wang
Houyong Chen
Huangxi Zhu
Huimin Wu
Huiyong Guo
Jia Wang
Jian Zhou
Jianjian Sun
Jiaoren Wu
Jiaran Zhang
Jiashu Lv
Jiashuo Liu
Jiayi Fu
Jiayu Liu
Jie Cheng
Jie Luo
Jie Yang
Jie Zhou
Jieyi Hou
Jing Bai
Jingcheng Hu
Jingjing Xie
Jingwei Wu
Jingyang Zhang
Jishi Zhou
Junfeng Liu
Junzhe Lin
Ka Man Lo
Kai Liang
Kaibo Liu
Kaijun Tan
Kaiwen Yan
Kaixiang Li
Kang An
Kangheng Lin
Lei Yang
Liang Lv
Liang Zhao
Liangyu Chen
Lieyu Shi
Liguo Tan
Lin Lin
Lina Chen
Luck Ma
Mengqiang Ren
Michael Li
Ming Li
Mingliang Li
Mingming Zhang
Mingrui Chen
Mitt Huang
Na Wang
Peng Liu
Qi Han
Qian Zhao
Qinglin He
Qinxin Du
Qiuping Wu
Quan Sun
Rongqiu Yang
Ruihang Miao
Ruixin Han
Ruosi Wan
Ruyan Guo
Shan Wang
Shaoliang Pang
Shaowen Yang
Shengjie Fan
Shijie Shang
Shiliang Yang
Shiwei Li
Shuangshuang Tian
Siqi Liu
Siye Wu
Siyu Chen
Song Yuan
Tiancheng Cao
Tianchi Yue
Tianhao Cheng
Tianning Li
Tingdan Luo
Wang You
Wei Ji
Wei Yuan
Wei Zhang
Weibo Wu
Weihao Xie
Wen Sun
Wenjin Deng
Wenzhen Zheng
Wuxun Xie
Xiangfeng Wang
Xiangwen Kong
Xiangyu Liu
Xiangyu Zhang
Xiaobo Yang
Xiaojia Liu
Xiaolan Yuan
Xiaoran Jiao
Xiaoxiao Ren
Xiaoyun Zhang
Xin Li
Xin Liu
Xin Wu
Xing Chen
Xingping Yang
Xinran Wang
Xu Zhao
Xuan He
Xuanti Feng
Xuedan Cai
Xuqiang Zhou
Yanbo Yu
Yang Li
Yang Xu
Yanlin Lai
Yanming Xu
Yaoyu Wang
Yeqing Shen
Yibo Zhu
Yichen Lv
Yicheng Cao
Yifeng Gong
Yijing Yang
Yikun Yang
Yin Zhao
Yingxiu Zhao
Yinmin Zhang
Yitong Zhang
Yixuan Zhang
Yiyang Chen
Yongchi Zhao
Yongshen Long
Yongyao Wang
Yousong Guan
Yu Zhou
Yuang Peng
Yuanhao Ding
Yuantao Fan
Yuanzhen Yang
Yuchu Luo
Yudi Zhao
Yue Peng
Yueqiang Lin
Yufan Lu
Yuling Zhao
Yunzhou Ju
Yurong Zhang
Yusheng Li
Yuxiang Yang
Yuyang Chen
Yuzhu Cai
Zejia Weng
Zetao Hong
Zexi Li
Zhe Xie
Zheng Ge
Zheng Gong
Zheng Zeng
Zhenyi Lu
Zhewei Huang
Zhichao Chang
Zhiguo Huang
Zhiheng Hu
Zidong Yang
Zili Wang
Ziqi Ren
Zixin Zhang
Zixuan Wang
et al. (115 additional authors not shown)
Main: 54 Pages
10 Figures
Bibliography: 13 Pages
26 Tables
Abstract

We introduce Step 3.5 Flash, a sparse Mixture-of-Experts (MoE) model that bridges frontier-level agentic intelligence and computational efficiency. We focus on what matters most when building agents: sharp reasoning and fast, reliable execution. Step 3.5 Flash pairs a 196B-parameter foundation with 11B active parameters for efficient inference. It is optimized with interleaved 3:1 sliding-window/full attention and Multi-Token Prediction (MTP-3) to reduce the latency and cost of multi-round agentic interactions. To reach frontier-level intelligence, we design a scalable reinforcement learning framework that combines verifiable signals with preference feedback, while remaining stable under large-scale off-policy training, enabling consistent self-improvement across mathematics, code, and tool use. Step 3.5 Flash demonstrates strong performance across agent, coding, and math tasks, achieving 85.4% on IMO-AnswerBench, 86.4% on LiveCodeBench-v6 (2024.08-2025.05), 88.2% on tau2-Bench, 69.0% on BrowseComp (with context management), and 51.0% on Terminal-Bench 2.0, comparable to frontier models such as GPT-5.2 xHigh and Gemini 3.0 Pro. By redefining the efficiency frontier, Step 3.5 Flash provides a high-density foundation for deploying sophisticated agents in real-world industrial environments.
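
To make two figures from the abstract concrete, the sketch below lays out an interleaved 3:1 sliding-window/full attention schedule and computes the fraction of parameters active per token (11B of 196B). It is a minimal illustration rather than the authors' implementation: the function names, the 8-layer depth, and the choice to place the full-attention layer last in each group of four are assumptions, since the abstract states only the 3:1 ratio.

```python
def attention_schedule(num_layers: int, swa_per_full: int = 3) -> list[str]:
    """Label each layer as sliding-window or full attention.

    Assumption: every group of (swa_per_full + 1) layers ends with the
    full-attention layer. The abstract gives only the 3:1 ratio, not the order.
    """
    period = swa_per_full + 1
    return [
        "full" if (i + 1) % period == 0 else "sliding_window"
        for i in range(num_layers)
    ]


def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Fraction of the model's parameters used per token in a sparse MoE."""
    return active_params_b / total_params_b


# Hypothetical 8-layer stack, chosen only to keep the printout short.
schedule = attention_schedule(8)
print(schedule)

# 11B active out of 196B total, as stated in the abstract.
print(f"{active_fraction(11, 196):.1%}")  # prints 5.6%
```

Under these assumptions, three of every four layers attend only within a local window, so the cost of full attention over long contexts is paid in a quarter of the layers, while roughly 5.6% of the parameters are active for any given token.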
