phenotypes_plot

Scripts for making nice plots of cohorts of patients based on HPO terms

Examples

A small random subsample of an in-house IMGAG database was used to generate these plots.

Grey dots = diseases from OMIM or ORPHA ( http://purl.obolibrary.org/obo/hp/hpoa/genes_to_phenotype.txt ). Colored dots = patients from your cohort. 60 clusters k-means is used to generate labels (mainly for illustration puproses) - you can put your own labels on the plot without labels.

Cohort plot based on predefined disease group annotation

Disease groups annotated manually by submitters.

With cluster labels

Without cluster labels

Cohort plot based on automatically generated with HPO terms disease group annotation

With cluster labels

Groups defined by 3 HPO terms (Ataxia, Spasticity, Dystonia):

Without cluster labels

What do you need

Install the following packages for you R:

install.packages("ontologyIndex")
install.packages("ontologySimilarity")
install.packages("Rcpp")
install.packages("sets")
install.packages("ggplot2")
install.packages("umap")
install.packages("ggrepel")

Download Human Phenotype Ontology obo file: https://raw.githubusercontent.com/obophenotype/human-phenotype-ontology/master/hp.obo .

Download genes_to_phenotype from https://hpo.jax.org/app/download/annotation .

You should have a text file with your cohort, with one column denoting unique sample IDs, and another column providing HPOs per patient, as "HP:0000238 - Hydrocephalus; HP:0000505 - Visual impairment". If your HPOs are provided differently, you have to modify the following code:

  hpo_numbers = data_for_plotting[which(data_for_plotting$X.name == name), index_of_hpos]
  hpos_list <- strsplit(hpo_numbers, "; ")[[1]]
  hpo_numbers = c()
  for (elem in hpos_list) {
    hpo_num = strsplit(elem, " - ")[[1]][1]
    if (hpo_num %in% hpo$id) {
      hpo_numbers = c(hpo_numbers, hpo_num)
    }
  }

How to run

Modify your plot_cohort_within_omim.R file (the CONFIG block, before the CODE block) and put your paths there. Specify which column name indicates sample names (index_of_IDs = which(colnames(data_for_plotting) == "X.name")) and which column name is for HPO terms (index_of_hpos).

If you have manually annotated cathegories (shown as color dots on the final plot), comment disease_cathegory_doc and use function annotate_cathegory_manually. Otherwise, you need to run create_hpo_list_for_annotation.R prior to plot_cohort_within_omim.R.

In create_hpo_list_for_annotation.R you need to specify HPOs which define your disease group (e.g., example, ataxia, spasticity and dystonia would be set_of_branches = c("HP:0001257", "HP:0001332", "HP:0001251").

Then you generate HPOs.disease.tsv file - use it in plot_cohort_within_omim.R as disease_cathegory_doc.

Then simply run plot_cohort_within_omim.R as Rscript plot_cohort_within_omim.R or line by line in R!

Since the plot is based on all OMIM diseases, it takes several (2-4 hours) to produce the plots. It makes sense to execute the code line by line.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
images		images
LICENSE		LICENSE
README.md		README.md
create_hpo_list_for_annotation.R		create_hpo_list_for_annotation.R
plot_cohort_within_omim.R		plot_cohort_within_omim.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

phenotypes_plot

Examples

Cohort plot based on predefined disease group annotation

With cluster labels

Without cluster labels

Cohort plot based on automatically generated with HPO terms disease group annotation

With cluster labels

Without cluster labels

What do you need

How to run

About

Releases

Packages

Languages

License

GermanDemidov/phenotypes_plot

Folders and files

Latest commit

History

Repository files navigation

phenotypes_plot

Examples

Cohort plot based on predefined disease group annotation

With cluster labels

Without cluster labels

Cohort plot based on automatically generated with HPO terms disease group annotation

With cluster labels

Without cluster labels

What do you need

How to run

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages