
Ensemble Methods for Survival Data with Time-Varying Covariates

Abstract

Survival data with time-varying covariates are common in practice. However, the traditional survival forests (conditional inference forest, relative risk forest, and random survival forest) have accommodated only time-invariant covariates. Similarly, the recently proposed transformation forest, which incorporates split statistics suitable for non-proportional hazards settings, has employed only time-invariant covariates. We generalize the conditional inference and relative risk forests to allow time-varying covariates. We compare their performance with that of the Cox model and the transformation forest, both adapted to accommodate time-varying covariates, through a comprehensive simulation study in which the Kaplan-Meier estimate serves as a benchmark. In general, the performance of the two proposed forests improves substantially over the Kaplan-Meier estimate as the estimation conditions become more favorable. Taking all other factors into account, under the proportional hazards (PH) setting the best method is always one of the two proposed forests, while under the non-PH setting it is the adapted transformation forest. K-fold cross-validation can be an effective tool for choosing between the methods in practice. Finally, the performance of the proposed forest methods for time-invariant covariate data is broadly similar to that found for time-varying covariate data. We also propose a general framework for estimating a survival function in the presence of time-varying covariates, which can be applied to any method that uses the counting process (pseudo-subject) approach to handling time-varying covariates. This novel estimate of a single survival function takes the multiple survival estimation outputs corresponding to each pseudo-subject and combines them in a theoretically justified way to form a proper monotone decreasing survival function estimate.
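
To make the pseudo-subject idea concrete, the following is a minimal sketch of one standard way to chain per-interval survival estimates into a single curve: multiply the conditional survival probabilities across consecutive intervals. It is an illustration under stated assumptions, not necessarily the estimator proposed in the paper. The function name `combine_pseudo_subject_survival`, its arguments, and the exponential curves in the usage lines are all hypothetical.

```python
import math


def combine_pseudo_subject_survival(intervals, surv_fns, t):
    """Chain per-pseudo-subject survival curves into one value S(t).

    Assumptions (illustrative, not taken from the paper):
      - intervals: contiguous (start, stop] pairs, sorted by time, one per
        pseudo-subject of the same original subject.
      - surv_fns: one callable per interval returning a positive,
        non-increasing survival estimate S_j(u).
    Interval j contributes the conditional probability
    S_j(min(t, stop_j)) / S_j(start_j), i.e. surviving to the end of the
    interval (or to t) given survival up to its start.
    """
    prob = 1.0
    for (start, stop), surv in zip(intervals, surv_fns):
        if t <= start:
            break  # t falls before this interval; later ones do not apply
        upper = min(t, stop)
        prob *= surv(upper) / surv(start)
    return prob


# Hypothetical subject whose covariate changes at time 1, so that the
# estimated hazard rate moves from 0.5 to 1.0.
intervals = [(0.0, 1.0), (1.0, 3.0)]
surv_fns = [lambda u: math.exp(-0.5 * u), lambda u: math.exp(-1.0 * u)]
s_at_2 = combine_pseudo_subject_survival(intervals, surv_fns, 2.0)
```

Each factor is a ratio of a non-increasing function to its own value at the interval start, so it lies in (0, 1] and the product cannot increase with t. Beyond the last observed interval the sketch simply holds the final value, which is a simplification.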
