WikiORA - Gene Over-Representation Analysis

WikiORA is a tool designed to simplify the process of gene set over-representation analysis by integrating data from Wikidata and Wikipedia. Our aim is to provide an easy-to-use platform for researchers to identify significantly enriched gene sets in their data, using a combination of curated gene sets from various sources.

The tool also adds a learning dimension to exploratory analysis, helping bioinformaticians make sense of gene lists.

Features

Integrates information from Wikidata, which has been enriched with Wikipedia, Gene Ontology, and PanglaoDB
Supports human and mouse gene sets
Provides over-representation analysis with a hypergeometric test and Bonferroni correction
Interactive results with links to Wikipedia pages for enriched terms
Downloadable gene set files (GMT format)
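
As an illustration of the GMT format mentioned above: GMT is a tab-separated text format with one gene set per line, where the first field is the set name, the second is a description, and the remaining fields are the member genes. The sketch below reads such text in Python. The function name `parse_gmt` and the gene and set names are placeholders for illustration and are not part of WikiORA.

```python
from typing import Dict, Set


def parse_gmt(text: str) -> Dict[str, Set[str]]:
    """Parse GMT-formatted text into a {gene_set_name: set_of_genes} mapping."""
    gene_sets: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # a valid line needs a name, a description, and at least one gene
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Placeholder content, not real WikiORA data.
example = "set_a\tfirst example set\tGENE1\tGENE2\tGENE3\nset_b\tsecond example set\tGENE4\tGENE5\n"
for set_name, members in parse_gmt(example).items():
    print(set_name, sorted(members))
```

To read a downloaded file, pass its contents to the function, for example `parse_gmt(open(path, encoding="utf-8").read())`, where `path` points to the GMT file.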

How It Works

WikiORA uses the following steps to perform over-representation analysis:

Input: A list of genes is provided by the user.
Background Gene Sets: The background gene sets are defined using data curated into Wikidata.
Overlap Calculation: For each gene set, the overlap between the user-provided gene list and the genes associated with the gene set (and its Wikipedia page) is calculated.
p-value Calculation: The p-value is calculated using the hypergeometric test, representing the probability of observing at least as many overlapping genes by chance.
Correction: The Bonferroni correction is applied to account for multiple testing and adjust the p-values.
Results: Results are sorted by p-value to highlight the most significantly over-represented terms.
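
The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the WikiORA implementation: the function names, the toy gene sets, and the `background_size` parameter are assumptions, since this README does not state how the size of the gene universe is determined. The sketch also assumes that every gene in the query belongs to that universe.

```python
from math import comb
from typing import Dict, Iterable, List, Set, Tuple


def hypergeom_sf(k: int, population: int, successes: int, draws: int) -> float:
    """Return P(X >= k) for a hypergeometric variable X.

    population: number of genes in the background (assumed universe)
    successes:  number of genes in the gene set
    draws:      number of genes in the user-provided list
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def over_representation(
    query: Iterable[str],
    gene_sets: Dict[str, Set[str]],
    background_size: int,
) -> List[Tuple[str, int, float, float]]:
    """Return (set name, overlap size, p-value, Bonferroni-adjusted p-value) tuples."""
    query_set = set(query)
    n_tests = len(gene_sets)
    results = []
    for name, genes in gene_sets.items():
        overlap = len(query_set & genes)
        p_value = hypergeom_sf(overlap, background_size, len(genes), len(query_set))
        adjusted = min(1.0, p_value * n_tests)  # Bonferroni: multiply by number of tests
        results.append((name, overlap, p_value, adjusted))
    results.sort(key=lambda row: row[2])  # most significant first
    return results


# Toy data for illustration only.
toy_sets = {"set_a": {"G1", "G2", "G3"}, "set_b": {"G7", "G8"}}
for row in over_representation(["G1", "G2", "G4", "G5"], toy_sets, background_size=10):
    print(row)
```

Bonferroni correction multiplies each p-value by the number of gene sets tested and caps the result at 1, which is conservative when many gene sets are tested at once.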

Data Sources

WikiORA uses Wikidata as the data source for its gene sets. Wikidata combines community curation with imports from sources such as Gene Ontology and PanglaoDB.

Citing

(Manuscript in preparation)

Although the gene sets include more information than the original data sources, we recommend also citing PanglaoDB when using the cell type marker data. For Gene Ontology gene sets, we recommend also citing the Gene Ontology Annotation (GOA) Database and the Gene Ontology Resource.

Team

WikiORA is developed in Brazil by a team of bioinformaticians passionate about open knowledge. The project is led by Tiago Lubiana at the Computational Systems Biology Laboratory, headed by Prof. Helder Nakaya.

Contact Us

If you have any questions, feedback, or suggestions, please feel free to contact us via GitHub.

Usage and installation

WikiORA is available as a web server at https://wikiora.sysbio.tools.

To run WikiORA locally, clone the repository and install the required dependencies:

git clone https://github.com/lubianat/wikiora.git
cd wikiora/www/python/src
pip install -r requirements.txt

Start the local server:

flask run

Open your web browser and go to http://127.0.0.1:5000 to access WikiORA.

Hosting

This project is hosted on Toolforge at wikiora.toolforge.org.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

Made in 🇧🇷 in 2024. Content from Wikidata is under CC0.

