Home

Overview

This tutorial provides a step-by-step guide to performing basic polygenic risk score (PRS) analyses and accompanies our PRS Guide paper. It aims to give a simple introduction to PRS analyses for those new to the field, while giving existing users a better understanding of the processes and implementation "under the hood" of popular PRS software.

The tutorial is divided into four main sections that mirror the structure of our guide paper. The first two sections, on QC, correspond to Section 2 of the paper and constitute a 'QC checklist' for PRS analyses. The third section, on calculating PRS (with examples using PLINK, PRSice-2, LDpred-2 and lassosum), corresponds to Section 3 of the paper. The fourth section, which provides some examples of visualising PRS results, accompanies Section 4 of the paper.

  1. Quality Control (QC) of Base Data
  2. Quality Control (QC) of Target Data
  3. Calculating and analysing PRS
  4. Visualising PRS Results

We will be referring to our guide paper in each section and so you may find it helpful to have the paper open as you go through the tutorial.

Warning

Data used in this tutorial are simulated and intended for demonstration purposes only. The results from this tutorial will not reflect the true performance of different software.

Note

We assume you have basic knowledge of how to use the terminal, plink and R. If you are unfamiliar with any of these, you can refer to the following online resources:

Software                   Link
terminal (OS X / Linux)    1, 2
terminal (Windows)         1, 2
plink                      v1.90, v1.07
R                          1

Note

This tutorial is written for Linux and OS X operating systems. Windows users will need to change some commands accordingly.

Note

Throughout the tutorial you will see tabs above some of the code:

echo "Tab A"
echo "Tab B"

You can click on a tab to switch to the alternative code (e.g. for a different operating system).

Datasets

  1. Base data
  2. Target data: Simulated data based on the 1000 Genomes Project European samples

Requirements

To follow the tutorial, you will need the following programs installed:

  1. R (version 3.2.3+)
  2. PLINK 1.9
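
Before starting, it can help to confirm that both programs are reachable from your terminal. The following is a minimal sketch, not part of the original tutorial: `check_tool` is a hypothetical helper name, and we assume the executables are called `Rscript` and `plink` and are on your `PATH` (your installation may use different names or locations).

```shell
# Hypothetical helper: report whether a program can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Assumption: R provides the Rscript executable on the PATH.
if check_tool Rscript; then
    # Compare the installed R version against the minimum listed above.
    Rscript -e 'cat("R >= 3.2.3:", getRversion() >= "3.2.3", "\n")'
fi

# Assumption: the PLINK 1.9 executable is named "plink".
if check_tool plink; then
    # Print the PLINK version string so you can confirm it is a 1.9 build.
    plink --version
fi
```

If either program is reported as missing, install it (or add it to your `PATH`) before continuing with the QC sections.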

Citation

If you find this tutorial helpful for a publication, then please consider citing:

Choi, S.W., Mak, T.S. & O’Reilly, P.F. Tutorial: a guide to performing polygenic risk score analyses. Nat Protoc (2020). https://doi.org/10.1038/s41596-020-0353-1