
SURVIVAL BUMP HUNTING FOR IDENTIFICATION AND CHARACTERIZATION OF INFORMATIVE PROGNOSTIC SUBGROUPS  

JEAN-EUDES DAZARD, PH.D.
Division of Bioinformatics
Center for Proteomics and Bioinformatics
Case Western Reserve University
TUESDAY, DECEMBER 10, 2013
2:00 p.m. – 3:00 p.m., CRB 692

Subgroup discovery can potentially provide novel insights into the etiology of a disease and help stratify patients by diagnosis, prognosis and response to therapy. Here, we derived a rule-induction peeling method called SurvivalPRIM to identify and characterize subgroups of subjects with extreme survival outcomes. Although other parametric and non-parametric survival models exist, none directly addresses the problem of identifying local and global extrema. We investigated the importance of predictor informativeness, peeling criteria, and cross-validation (CV) in several situations using simulated data. We report peeling trajectories of predictors, hazard ratios, ranking statistics, event-free probabilities, and median times to event against subgroup supports. Trace curves as well as Kaplan-Meier survival probability curves with log-rank test p-values were also generated for these subgroups. In all cases, two CV schemes were compared with each other and with no cross-validation. We also compared our approach to regression survival tree-based partitioning methods and to semi-supervised survival versions of clustering and PCA procedures. The results show how critical CV is for reducing bias and over-fitting, and how the subgroups identified differ from those found by other methods. The cross-validated methodology was applied to 18 publicly available clinical datasets covering various diseases and censored time-to-event outcomes. In several datasets, SurvivalPRIM reliably identified subsets of patients characterized by a specific profile of clinical, demographic and genetic variables with a distinct survival outcome. The performance of SurvivalPRIM on both simulated and clinical data will be discussed with a view to future applications to larger omics datasets and precise medical interventions. The implementation of the method will be released as an R package called ‘primSRC’ for generalized responses in survival, regression and classification settings.
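
For readers unfamiliar with peeling, the general idea behind PRIM-style bump hunting can be sketched as follows. This is only an illustrative toy in Python, not the SurvivalPRIM method or the ‘primSRC’ package: the function names, the `alpha` and `min_support` parameters, the rank-based peeling, and the use of a crude events-per-follow-up-time rate as the peeling criterion are all assumptions made for this example, and no cross-validation is performed.

```python
def hazard_rate(times, events):
    """Crude hazard estimate: number of events per unit of follow-up time."""
    total_time = sum(times)
    return sum(events) / total_time if total_time > 0 else 0.0


def peel_once(X, times, events, box, alpha):
    """Return (score, variable, side, new_box) for the best single peel, or None."""
    n_peel = max(1, int(alpha * len(box)))
    if n_peel >= len(box):
        return None
    best = None
    for j in range(len(X[0])):
        # Rank the subjects currently in the box along variable j.
        order = sorted(box, key=lambda i: X[i][j])
        # Candidate peels: drop the lowest or the highest n_peel subjects.
        for side, kept in (("low", order[n_peel:]), ("high", order[:-n_peel])):
            score = hazard_rate([times[i] for i in kept],
                                [events[i] for i in kept])
            if best is None or score > best[0]:
                best = (score, j, side, kept)
    return best


def peel(X, times, events, alpha=0.1, min_support=0.2):
    """Greedily shrink the box toward the highest-hazard region.

    Returns the final box (subject indices) and the peeling trajectory
    as a list of (support, hazard_rate) pairs.
    """
    n = len(X)
    box = list(range(n))
    trajectory = [(1.0, hazard_rate(times, events))]
    while True:
        step = peel_once(X, times, events, box, alpha)
        if step is None or len(step[3]) < min_support * n:
            break
        score, _, _, box = step
        trajectory.append((len(box) / n, score))
    return sorted(box), trajectory


# Hypothetical toy data: 20 subjects, one informative covariate (column 0)
# and one uninformative covariate (column 1). Subjects 15..19 have an early
# event at time 1; the others are censored at time 10.
X = [[i / 19, (7 * i % 20) / 19] for i in range(20)]
times = [1.0 if i >= 15 else 10.0 for i in range(20)]
events = [1 if i >= 15 else 0 for i in range(20)]

box, trajectory = peel(X, times, events, alpha=0.1, min_support=0.25)
```

On this toy data the peeling shrinks the box along the informative covariate until only the five short-survival subjects remain, and `trajectory` records the hazard estimate against subgroup support at each step. Without cross-validation, estimates obtained this way are optimistically biased, which is the issue the abstract highlights.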

This is joint work with Michael Choe, MD, Case Western Reserve University; Michael Leblanc, PhD, the University of Washington; and J. Sunil Rao, PhD, the University of Miami.