PRINCIPAL COMPONENTS ANALYSIS AND BUMP HUNTING USING PRIM

DANIEL DIAZ, PH.D.
Postdoctoral Research Associate, Division of Biostatistics
University of Miami

TUESDAY, SEPTEMBER 24, 2013
2:00 p.m. – 3:00 p.m.
CRB 692

Principal Components Analysis (PCA) is a widely used technique for dimension reduction and for characterizing variability in multivariate populations. Our interest lies in studying when and why PCA can be used to effectively model the relationship between a response and a set of predictors. Specifically, take Z to be a continuous random variable whose support traverses the origin of a p-dimensional continuous space E. Let Y be a p-dimensional continuous random vector in E such that the support of each component of Y traverses the origin of E and the p components of Y are pairwise orthogonal. Select uniformly in E any vector X of p continuous random variables whose supports traverse the origin. We prove that Y explains Z better than X does, as measured by correlation. In particular, we prove that the principal components explain a response variable better than the original input variables do. This has important consequences for modeling data in high dimensions. We illustrate this result using PRIM, a bump-hunting algorithm used to identify and characterize modal subgroups in populations. We study the empirical performance of our findings via simulations that mimic high-dimensional applications.
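
As a small illustration of the pairwise orthogonality property mentioned above, the following sketch computes principal component scores with a singular value decomposition and checks that they are mutually uncorrelated. It is not the speakers' method or their simulation design: the data are synthetic, and the helper names (principal_component_scores, max_abs_offdiag_correlation) and the sample sizes are assumptions made purely for illustration. It does not attempt to reproduce the correlation result of the talk.

```python
import numpy as np


def principal_component_scores(data):
    """Return principal component scores for data of shape (n_samples, p)."""
    centered = data - data.mean(axis=0)
    # Thin SVD: rows of vt are the principal directions, u * s are the scores.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return u * s


def max_abs_offdiag_correlation(columns):
    """Largest absolute correlation between two distinct columns."""
    corr = np.corrcoef(columns, rowvar=False)
    off_diagonal = corr - np.diag(np.diag(corr))
    return np.abs(off_diagonal).max()


# Hypothetical synthetic predictors: independent noise mixed by a random matrix,
# so the original variables are correlated with each other.
rng = np.random.default_rng(0)
n_samples, p = 500, 5
mixing = rng.normal(size=(p, p))
x = rng.normal(size=(n_samples, p)) @ mixing

scores = principal_component_scores(x)

print("max |corr| among original variables: ", max_abs_offdiag_correlation(x))
print("max |corr| among principal components:", max_abs_offdiag_correlation(scores))
```

The second printed value should be zero up to floating-point error, since the score columns are orthogonal by construction, whereas the original variables are generally correlated.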

This is joint work with J. Sunil Rao of the University of Miami and Jean-Eudes Dazard of Case Western Reserve University.