
Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation
Papers citing "Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation"
Title |
---|
Evaluating Large Language Models on the Spanish Medical Intern Resident (MIR) Examination 2024/2025: A Comparative Analysis of Clinical Reasoning and Knowledge Application |