arXiv:2409.02143

MLOmics: Cancer Multi-Omics Database for Machine Learning

2 September 2024
Ziwei Yang
Rikuto Kotoge
Zheng Chen
Xihao Piao
Lingwei Zhu
Peng Gao
Yasuko Matsubara
Yasushi Sakurai
Jimeng Sun
Main: 4 pages; Bibliography: 3 pages; Appendix: 17 pages; 4 figures; 3 tables
Abstract

Framing the investigation of diverse cancers as a machine learning problem has recently shown significant potential in multi-omics analysis and cancer research. Successful machine learning models depend on high-quality training datasets with sufficient data volume and adequate preprocessing. However, although several public data portals exist, including The Cancer Genome Atlas (TCGA) multi-omics initiative and open databases such as LinkedOmics, these resources are not ready to use off the shelf with existing machine learning models. In this paper, we introduce MLOmics, an open cancer multi-omics database designed to better serve the development and evaluation of bioinformatics and machine learning models. MLOmics contains 8,314 patient samples covering all 32 cancer types with four omics types, stratified features, and extensive baselines. Complementary support for downstream analysis and bio-knowledge linking is also included to support interdisciplinary analysis.
