Detecting and Controlling for Biased Sampling in Case/Control PRS Studies

Abstract

In case-control studies, samples are usually ascertained non-randomly, so the sample prevalence may not reflect the population prevalence of the trait, leading to biased estimates of the coefficient of determination (R2). Previous studies have proposed an R2 measure on the liability scale that provides an accurate R2 for binary traits under ascertainment, on the condition that both cases and controls are randomly sampled from their respective populations. Bias in case-control sampling, for example cases drawn from hospitalized patients or from volunteers (healthier samples), might lead to biased estimation of the phenotypic variance. It would therefore be desirable to estimate case severity where possible. Here, we propose a novel method for detecting the direction of biased case selection and adjusting the R2 estimate in polygenic score analyses. Using simulations and analytical derivations, we demonstrate that the Pearson-Aitken selection algorithm can be used to estimate an unbiased R2 and that the polygenic score distribution can be used to infer case severity. Our algorithm estimated case severity with an accuracy of 93.7% for volunteer cases and 61.9% for hospitalized cases. It also consistently outperformed existing methods in providing an adjusted R2 when cases were non-randomly sampled from the population. In summary, we provide a novel approach for detecting bias in case sampling, together with an adjustment to the phenotypic variance explained (R2) in PRS analyses.
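
For readers unfamiliar with the liability-scale R2 mentioned above, the following is a minimal sketch of one widely used transformation from an observed-scale R2 (from a linear model on 0/1 case status) to the liability scale. It is an illustration of the existing baseline, not the adjustment proposed in this work, and it may not be the exact measure the abstract refers to. The function name `liability_r2` and its parameter names are hypothetical; `K` is assumed to be the population prevalence and `P` the proportion of cases in the sample. The transformation assumes cases and controls are each randomly sampled from their respective populations, which is precisely the condition that biased case selection violates.

```python
from statistics import NormalDist


def liability_r2(r2_obs: float, K: float, P: float) -> float:
    """Convert an observed-scale R2 to the liability scale.

    r2_obs: R2 on the observed (0/1) scale   (illustrative name)
    K:      assumed population prevalence of the trait
    P:      proportion of cases in the ascertained sample
    """
    std_norm = NormalDist()
    t = std_norm.inv_cdf(1.0 - K)  # liability threshold for being a case
    z = std_norm.pdf(t)            # standard normal density at the threshold
    m = z / K                      # mean liability of cases
    # Scale factor accounting for the binary scale and for over-sampling cases
    c = (K * (1.0 - K)) ** 2 / (z ** 2 * P * (1.0 - P))
    # Correction for the change in liability variance due to ascertainment
    shift = m * (P - K) / (1.0 - K)
    theta = shift * (shift - t)
    return r2_obs * c / (1.0 + r2_obs * theta * c)


if __name__ == "__main__":
    # Example: 1% population prevalence, balanced case-control sample
    print(liability_r2(r2_obs=0.05, K=0.01, P=0.5))
```

When `P` equals `K` the ascertainment correction vanishes (`theta` is zero) and only the rescaling from the binary to the liability scale remains.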

Publication
In World Congress of Psychiatric Genetics
Shing Wan Choi
Postdoctoral Fellow

I am a Postdoctoral Fellow working under Dr Paul F. O’Reilly at the Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai.