
Magistral

Main: 17 pages; Bibliography: 3 pages; Appendix: 3 pages; 17 figures; 7 tables
Abstract

We introduce Magistral, Mistral's first reasoning model, and our own scalable reinforcement learning (RL) pipeline. Instead of relying on existing implementations and RL traces distilled from prior models, we follow a ground-up approach, relying solely on our own models and infrastructure. Notably, we demonstrate a stack that enabled us to explore the limits of pure RL training of LLMs, present a simple method to force the reasoning language of the model, and show that RL on text data alone maintains most of the initial checkpoint's capabilities. We find that RL on text maintains or improves multimodal understanding, instruction following, and function calling. We present Magistral Medium, trained for reasoning on top of Mistral Medium 3 with RL alone, and we open-source Magistral Small (Apache 2.0), which additionally includes cold-start data from Magistral Medium.

@article{mistral-ai2025_2506.10910,
  title={Magistral},
  author={ Mistral-AI and Abhinav Rastogi and Albert Q. Jiang and Andy Lo and Gabrielle Berrada and Guillaume Lample and Jason Rute and Joep Barmentlo and Karmesh Yadav and Kartik Khandelwal and Khyathi Raghavi Chandu and Léonard Blier and Lucile Saulnier and Matthieu Dinot and Maxime Darrin and Neha Gupta and Roman Soletskyi and Sagar Vaze and Teven Le Scao and Yihan Wang and Adam Yang and Alexander H. Liu and Alexandre Sablayrolles and Amélie Héliou and Amélie Martin and Andy Ehrenberg and Anmol Agarwal and Antoine Roux and Arthur Darcet and Arthur Mensch and Baptiste Bout and Baptiste Rozière and Baudouin De Monicault and Chris Bamford and Christian Wallenwein and Christophe Renaudin and Clémence Lanfranchi and Darius Dabert and Devon Mizelle and Diego de las Casas and Elliot Chane-Sane and Emilien Fugier and Emma Bou Hanna and Gauthier Delerce and Gauthier Guinet and Georgii Novikov and Guillaume Martin and Himanshu Jaju and Jan Ludziejewski and Jean-Hadrien Chabran and Jean-Malo Delignon and Joachim Studnia and Jonas Amar and Josselin Somerville Roberts and Julien Denize and Karan Saxena and Kush Jain and Lingxiao Zhao and Louis Martin and Luyu Gao and Lélio Renard Lavaud and Marie Pellat and Mathilde Guillaumin and Mathis Felardos and Maximilian Augustin and Mickaël Seznec and Nikhil Raghuraman and Olivier Duchenne and Patricia Wang and Patrick von Platen and Patryk Saffer and Paul Jacob and Paul Wambergue and Paula Kurylowicz and Pavankumar Reddy Muddireddy and Philomène Chagniot and Pierre Stock and Pravesh Agrawal and Romain Sauvestre and Rémi Delacourt and Sanchit Gandhi and Sandeep Subramanian and Shashwat Dalal and Siddharth Gandhi and Soham Ghosh and Srijan Mishra and Sumukh Aithal and Szymon Antoniak and Thibault Schueller and Thibaut Lavril and Thomas Robert and Thomas Wang and Timothée Lacroix and Valeriia Nemychnikova and Victor Paltz and Virgile Richard and Wen-Ding Li and William Marshall and Xuanyu Zhang and Yunhao Tang },
  journal={arXiv preprint arXiv:2506.10910},
  year={2025}
}