OpenTME: An Open Dataset of AI-powered H&E Tumor Microenvironment Profiles from TCGA

Maaike Galama
Nina Kozar-Gillan
Christina Embacher
Todd Dembo
Cornelius Böhm
Evelyn Ramberger
Julika Ribbat-Idel
Rosemarie Krupar
Verena Aumiller
Miriam Hägele
Kai Standvoss
Gerrit Erdmann
Blanca Pablos
Ari Angelo
Simon Schallenberg
Andrew Norgan
Viktor Matyas
Klaus-Robert Müller
Maximilian Alber
Lukas Ruff
Frederick Klauschen
Main: 4 pages, 1 figure, 8 tables. Bibliography: 3 pages. Appendix: 5 pages.

Abstract

The tumor microenvironment (TME) plays a central role in cancer progression, treatment response, and patient outcomes, yet large-scale, consistent, and quantitative TME characterization from routine hematoxylin and eosin (H&E)-stained histopathology remains scarce. We introduce OpenTME, an open-access dataset of pre-computed TME profiles derived from 3,634 H&E-stained whole-slide images across five cancer types (bladder, breast, colorectal, liver, and lung cancer) from The Cancer Genome Atlas (TCGA). All outputs were generated using Atlas H&E-TME, an AI-powered application built on the Atlas family of pathology foundation models, which performs tissue quality control, tissue segmentation, cell detection and classification, and spatial neighborhood analysis, yielding over 4,500 quantitative readouts per slide at cell-level resolution. OpenTME is available for non-commercial academic research on Hugging Face. We will continue to expand OpenTME over time and anticipate it will serve as a resource for biomarker discovery, spatial biology research, and the development of computational methods for TME analysis.
