Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Yicheng Zou
Dongsheng Zhu
Lin Zhu
Tong Zhu
Yunhua Zhou
Peiheng Zhou
Xinyu Zhou
Dongzhan Zhou
Zhiwang Zhou
Yuhao Zhou
Bowen Zhou
Zhanping Zhong
Zhijie Zhong
Haiteng Zhao
Penghao Zhao
Xiaomeng Zhao
Zhiyuan Zhao
Yechen Zhang
Jin Zhang
Wenwei Zhang
Hongjie Zhang
Zhuo Zhang
Wenlong Zhang
Bo Zhang
Chao Zhang
Chen Zhang
Yuhang Zang
Fei Yuan
Jiakang Yuan
Jiashuo Yu
Jinhui Yin
Haochen Ye
Qian Yao
Bowen Yang
Danni Yang
Kaichen Yang
Ziang Yan
Jun Xu
Yicheng Xu
Wanghan Xu
Xuenan Xu
Chao Xu
Ruiliang Xu
Shuhao Xing
Long Xing
Xinchen Xie
Ling-I Wu
Zijian Wu
Zhenyu Wu
Lijun Wu
Yue Wu
Jianyu Wu
Wen Wu
Fan Wu
Xilin Wei
Qi Wei
Bingli Wang
Rui Wang
Ziyi Wang
Zun Wang
Yi Wang
Haomin Wang
Yizhou Wang
Lintao Wang
Yiheng Wang
Longjiang Wang
Bin Wang
Jian Tong
Zhongbo Tian
Huanze Tang
Chen Tang
Shixiang Tang
Yu Sun
Qiushi Sun
Xuerui Su
Qisheng Su
Chenlin Su
Demin Song
Jin Shi
Fukai Shang
Yuchen Ren
Pengli Ren
Xiaoye Qu
Yuan Qu
Jiantao Qiu
Yu Qiao
Biqing Qi
Runyu Peng
Tianshuo Peng
Jiahui Peng
Qizhi Pei
Zhuoshi Pan
Linke Ouyang
Wenchang Ning
Yichuan Ma
Zerun Ma
Ningsheng Ma
Runyuan Ma
Chengqi Lyu
Haijun Lv
Main: 15 pages, 9 figures, 4 tables; Bibliography: 4 pages
Abstract

We introduce Intern-S1-Pro, the first one-trillion-parameter scientific multimodal foundation model. At this unprecedented scale, the model delivers comprehensive improvements across both general and scientific domains. Beyond stronger reasoning and image-text understanding, it is augmented with advanced agent capabilities. Its scientific expertise has likewise been vastly expanded to master over 100 specialized tasks across critical scientific fields, including chemistry, materials, life sciences, and earth sciences. Training at this massive scale is made possible by the robust infrastructure of XTuner and LMDeploy, which enables highly efficient Reinforcement Learning (RL) training at the one-trillion-parameter level while ensuring strict precision consistency between training and inference. By seamlessly integrating these advancements, Intern-S1-Pro further strengthens the fusion of general and specialized intelligence, acting as a Specializable Generalist: it ranks in the top tier of open-source models on general capabilities while outperforming proprietary models in the depth of specialized scientific tasks.
