arXiv:2012.08483

Amazon SageMaker Autopilot: a white box AutoML solution at scale

15 December 2020
Piali Das
Laurence Rouesnel
Nikita Ivkin
Tanya Bansal
Zohar Karnin
Huibin Shen
I. Shcherbatyi
L. Ramakrishnan
Wilton Wu
Aida Zolic
P. Gautier
Alex Tang
Amr Ahmed
Jean Baptiste Faddoul
Rodolphe Jenatton
Fela Winkelmolen
P. Grao
Leo Dirac
Andre Perunicic
Miroslav Miladinovic
Giovanni Zappella
Cedric Archambeau
Matthias Seeger
Bhaskar Dutt
K. Venkateswar
Abstract

AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par performance. In this paper, we present Amazon SageMaker Autopilot: a fully managed system providing an automated ML solution that can be modified when needed. Given a tabular dataset and the target column name, Autopilot identifies the problem type, analyzes the data and produces a diverse set of complete ML pipelines including feature preprocessing and ML algorithms, which are tuned to generate a leaderboard of candidate models. In the scenario where the performance is not satisfactory, a data scientist is able to view and edit the proposed ML pipelines in order to infuse their expertise and business knowledge without having to revert to a fully manual solution. This paper describes the different components of Autopilot, emphasizing the infrastructure choices that allow scalability, high quality models, editable ML pipelines, consumption of artifacts of offline meta-learning, and a convenient integration with the entire SageMaker suite allowing these trained models to be used in a production setting.
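The first and last steps of the workflow described above (identifying the problem type from the target column, then ranking tuned candidates into a leaderboard) can be illustrated with a minimal sketch. This is not Autopilot's actual logic, which the abstract does not specify. The function names `infer_problem_type` and `leaderboard`, the `max_classes` threshold, and the candidate names and scores are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def infer_problem_type(target: Sequence[object], max_classes: int = 20) -> str:
    """Guess the problem type from a target column.

    Toy heuristic (an assumption, not Autopilot's method):
    two distinct values -> binary classification; numeric with many
    distinct values -> regression; otherwise multiclass classification.
    """
    values = [v for v in target if v is not None]  # ignore missing entries
    distinct = set(values)
    if len(distinct) < 2:
        raise ValueError("target column needs at least two distinct values")
    if len(distinct) == 2:
        return "BinaryClassification"
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if all_numeric and len(distinct) > max_classes:
        return "Regression"
    return "MulticlassClassification"


def leaderboard(
    candidates: Dict[str, float], higher_is_better: bool = True
) -> List[Tuple[str, float]]:
    """Rank candidate pipelines by validation metric, best first."""
    return sorted(candidates.items(), key=lambda kv: kv[1], reverse=higher_is_better)


if __name__ == "__main__":
    # Hypothetical target column and validation scores.
    print(infer_problem_type(["yes", "no", "yes", "no"]))
    for name, score in leaderboard({"xgboost": 0.91, "linear": 0.85, "mlp": 0.88}):
        print(f"{name}: {score:.2f}")
```

In a white-box setting like the one the abstract describes, a data scientist could override either step, for example by forcing the problem type or by editing a candidate pipeline before it is re-scored and re-ranked.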
