An R data package containing PGS methods comparison data and results across five European biobanks

pgsCompaR

pgsCompaR is an R data package that contains performance metrics for polygenic risk score (PGS) development methods measured across five European biobanks.

This data package provides no functions for comparing PGS methods. It contains only processed experimental data and documentation. The raw experimental data are also permissively licensed and publicly available, but are more difficult to work with.

Installation

The fastest way to install the development version of pgsCompaR is with devtools:

devtools::install_github("intervene-EU-H2020/pgsCompaR")

This data package depends only on base R. You can also download the built release and install it locally using install.packages().
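A local install from a downloaded release could look like the sketch below. The helper name `install_local_release` and the tarball file name are hypothetical, used only for illustration; substitute the path of the file you actually downloaded.

```r
# Hypothetical helper: install a built release tarball from a local path.
install_local_release <- function(tarball) {
  if (!file.exists(tarball)) {
    stop("Tarball not found: ", tarball)
  }
  # repos = NULL tells install.packages() to treat the argument as a file path.
  install.packages(tarball, repos = NULL, type = "source")
}

# Example call; the file name below is a placeholder, not a real release name.
# install_local_release("pgsCompaR_x.y.z.tar.gz")
```

Checking that the file exists first gives a clearer error than the one install.packages() reports for a missing path.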

Development dependencies

Development dependencies are required to run the scripts in data-raw/ that process the raw data and save the rda files in data/.

The simplest way to install the development dependencies is to use renv and restore the development profile:

$ git clone https://github.com/intervene-EU-H2020/pgsCompaR.git
$ cd pgsCompaR
$ R
R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
...
> renv::activate(profile="dev")
> renv::restore()

You may need to install renv first.
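One way to install renv only when it is missing is sketched below. The helper `ensure_installed` is a hypothetical name introduced for this example, and it assumes a CRAN mirror is configured in your R session.

```r
# Hypothetical helper: install a package from CRAN only if it is missing.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can be loaded after the attempt.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Run interactively before calling renv::activate(profile = "dev").
if (interactive()) {
  ensure_installed("renv")
}
```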

Example

There are four datasets exported by this package:

| Dataset | About |
| --- | --- |
| metrics | Polygenic risk score performance metrics table for single biobanks |
| meta_res | Equivalent to metrics, but meta-analysed |
| dst | Pairwise comparison of polygenic risk score development methods |
| pv_mrg | Equivalent to dst, but meta-analysed |

Dataset documentation can be viewed in R the normal way:

library(pgsCompaR)
data(metrics)
?metrics
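To get a quick overview of all four datasets at once, something like the following sketch may help. It makes no assumptions about column names and only prints the top-level structure of each object. The variable names `dataset_names` and `env` are illustrative.

```r
# Sketch: inspect the four exported datasets without assuming their columns.
dataset_names <- c("metrics", "meta_res", "dst", "pv_mrg")

if (requireNamespace("pgsCompaR", quietly = TRUE)) {
  # Load the datasets into a dedicated environment rather than the workspace.
  env <- new.env()
  data(list = dataset_names, package = "pgsCompaR", envir = env)

  for (name in dataset_names) {
    cat("\n==", name, "==\n")
    # max.level = 1 keeps the summary short for nested objects.
    str(get(name, envir = env), max.level = 1)
  }
} else {
  message("pgsCompaR is not installed; see the Installation section.")
}
```

Loading into a separate environment avoids overwriting objects with the same names in your workspace.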

License

The data are licensed under CC-BY-4.0.

If you reuse data from this package in published work, please cite our publication:

Remo Monti, Lisa Eick, Georgi Hudjashov, Kristi Läll, Stavroula Kanoni, Brooke N. Wolford, Benjamin Wingfield, Oliver Pain, Sophie Wharrie, Bradley Jermy, Aoife McMahon, Tuomo Hartonen, Henrike Heyne, Nina Mars, Samuel Lambert, Kristian Hveem, Michael Inouye, David A. van Heel, Reedik Mägi, Pekka Marttinen, Samuli Ripatti, Andrea Ganna, Christoph Lippert. "Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning" The American Journal of Human Genetics 2024. doi: https://doi.org/10.1016/j.ajhg.2024.06.003
