Cross-study Performance of Predictions, with Application to Genomics

    Giovanni Parmigiani
    Numerous gene signatures of patient prognosis for late-stage, high-grade ovarian cancer have been published, but diverse data and methods have made these difficult to compare objectively. However, the corresponding large volume of publicly available expression data creates an opportunity to validate previous findings and to develop more robust signatures. We thus built a database of uniformly processed and curated public ovarian cancer microarray data and clinical annotations, and re-implemented and validated 14 prognostic signatures published between 2007 and 2012. In this lecture I will describe the methodology and tools we developed for evaluating published signatures in this context. I will also use this application as the springboard for a more general discussion on how to evaluate statistical learning methods based on a collection of related studies.

