We introduce LongCat-Flash, a 560-billion-parameter Mixture-of-Experts (MoE) language model designed for both computational efficiency and advanced agentic capabilities. Stemming from the need for scalable efficiency, LongCat-Flash adopts two novel designs: (a) Zero-computation Experts, which enable dynamic computational budget allocation by activating 18.6B-31.3B parameters (27B on average) per token depending on contextual demands, thereby optimizing resource usage; (b) Shortcut-connected MoE, which enlarges the computation-communication overlap window and yields notable gains in inference efficiency and throughput compared to models of a comparable scale. We develop a comprehensive scaling framework for large models that combines hyperparameter transfer, model-growth initialization, a multi-pronged stability suite, and deterministic computation to achieve stable and reproducible training. Notably, leveraging the synergy between scalable architectural design and infrastructure efforts, we complete model training on more than 20 trillion tokens within 30 days, while achieving over 100 tokens per second (TPS) for inference at a cost of $0.70 per million output tokens. To cultivate LongCat-Flash towards agentic intelligence, we conduct large-scale pre-training on optimized mixtures, followed by targeted mid- and post-training on reasoning, code, and instructions, with further augmentation from synthetic data and tool-use tasks. Comprehensive evaluations demonstrate that, as a non-thinking foundation model, LongCat-Flash delivers highly competitive performance among other leading models, with exceptional strengths in agentic tasks. The model checkpoint of LongCat-Flash is open-sourced to foster community research.
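For intuition on the zero-computation expert design mentioned above, the following is a minimal sketch, assuming a PyTorch-style MoE layer whose router can select identity (no-op) experts alongside regular FFN experts, so the amount of computation applied per token varies with the routing decision. The class name, dimensions, expert counts, and routing details are illustrative assumptions, not the released LongCat-Flash implementation.

```python
# Illustrative sketch only: zero-computation (identity) experts in an MoE layer.
# Expert counts, dimensions, and routing details are assumptions, not the
# actual LongCat-Flash configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroComputationMoE(nn.Module):
    def __init__(self, d_model=64, n_ffn_experts=8, n_zero_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.n_ffn_experts = n_ffn_experts
        self.n_total = n_ffn_experts + n_zero_experts
        # Regular FFN experts that apply real computation to a token.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_ffn_experts)
        ])
        # The router scores FFN experts and zero-computation slots jointly.
        self.router = nn.Linear(d_model, self.n_total)

    def forward(self, x):
        # x: (n_tokens, d_model); each token picks its top_k experts.
        weights, idx = F.softmax(self.router(x), dim=-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            w_k = weights[:, k].unsqueeze(-1)
            for e in range(self.n_total):
                mask = idx[:, k] == e
                if not mask.any():
                    continue
                if e < self.n_ffn_experts:
                    y = self.experts[e](x[mask])  # computation-bearing expert
                else:
                    y = x[mask]  # zero-computation expert: identity, no extra FLOPs
                out[mask] = out[mask] + w_k[mask] * y
        return out


tokens = torch.randn(10, 64)
print(ZeroComputationMoE()(tokens).shape)  # torch.Size([10, 64])
```

Tokens routed to the identity slots pass through unchanged, which is what allows the number of activated parameters to fluctuate per token rather than staying fixed.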