Motivation: Analysis of structure-response relationships of drugs is key to understanding off-target and unexpected drug effects, and to developing hypotheses on how to better tailor drug therapies. New methods are required for analyzing the genome-wide effects of drugs across multiple cell lines against a large number of drug characteristics.
Results: In this paper, we present the first comprehensive data-driven analysis of genome-wide effects of drugs across multiple cancer cell lines (CMap database) with a probabilistic Bayesian latent variable model. The model is designed for integrating multiple data sources, decomposing them into data-set-specific properties and the interesting shared properties. We identified 11 components that link the structural descriptors with specific gene expression responses observed in the three cell lines. We recovered previously reported associations, generated new observations, and identified structures that may be responsible for the responses. Two examples are the previously unknown role of 15-delta prostaglandin J2 as a heat shock protein 90 (HSP90) inhibitor, where 3D descriptors (Pentacle-N2) are linked to the expression, and simvastatin inducing anti-inflammatory responses similar in nature to those of corticosteroids in leukemic cells.
Contact: suleiman.khan@aalto.fi, samuel.kaski@aalto.fi