A course of the Danish Health Data Science Sandbox
This course is based on the material developed for the NGS summer school at Aarhus University. The material is organized in four separate Jupyter notebooks in bash, Python, and R, where you will benefit from an interactive coding setup.
If you use any of this material for your research, please cite this course with the DOI below, and acknowledge the Health Data Science Sandbox project of the Novo Nordisk Foundation (grant number NNF20OC0063268). Doing so is of great help in supporting the project.
After the course, you will have knowledge of bioinformatics methods for analyzing genomes using NGS data, including knowledge of the existing types of genome data, how the different types of data can be displayed and analyzed, the current methods for genome assembly and analysis, their accuracy, and how they can be used. The course will enable you to devise and run a project that makes use of NGS data.
This is an introductory course. It requires a basic understanding of the biology behind sequencing, but not necessarily programming experience.
- Describe key challenges in the analysis of NGS data
- Explain the theoretical foundation for methods that use NGS for assembly and analysis of genomes
- Discuss the bioinformatic methods for genome analysis and hypothesize what drives the outcome of the methods
- Discuss original literature within the subjects and relate the discussed topics to analysis scenarios
- Apply bioinformatics tools within the selected application areas and reflect on the results, formulating your own conclusion in the proposed tasks
- Jupyter notebooks for interactive coding
- lecture slides from the instructor
You can find the links to the material in the table at the bottom of this page.
This course was originally one week long.
Heads of the course: Mikkel H. Schierup, Stig U. Andersen.
Responsible for the exercises: Lavinia I. Fechete, Jilong Ma, Samuele Soraggi.
Contact: Samuele Soraggi (samuele at birc.au.dk).
The table below lists the instructor's slides and links to the compiled notebooks, which you can also run on your own by following the instructions. Data alignment can also be performed on the Galaxy interactive webpage (see the manual in the table).
Topic | Slide | Notebook |
---|---|---|
Sequencing technologies | link | -- |
Mapping to reference | link | Notebook or Galaxy guide |
Data visualization | link | -- |
SNPs and structural variants | link | Notebook |
RNA sequencing | link | Notebook |
De-novo assembly | link | -- |
Microbiomes and metagenomics | link | -- |
Single cell RNA sequencing | link | Notebook |