Commit
Showing 14 changed files with 463 additions and 749 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
# ARG | ||
Under construction |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,22 @@ | ||
# Data analysis overview | ||
The raw input data is in the form of short sequences of length around 100 bp. | ||
Prior to the actual data analyis, it is generally advised to clean and trim the reads, but these are rather mechanical, fairly easy to do, and not really interesting. We will skip ahead to the more interesting parts of the analysi workflow. | ||
Prior to the actual data analysis, it is generally advised to clean and trim the reads, but these steps are rather mechanical, fairly easy to do, and not really interesting. We will skip ahead to the more interesting parts of the analysis workflow. | ||
|
||
The workflow, of course, depends on the research objectives. Some of the typical questions that researchers try to answer are: | ||
|
||
## Assembly-free | ||
1. What are the different species of bacteria that are present in the sample? | ||
2. In what relative abundance are they present? | ||
3. Are there differences in the bacterial community profiles among sites? | ||
4. What kinds of antibiotic resistance genes are present? In what proportion? | ||
5. What kinds of mobile genetic elements are present? In what proportion? | ||
|
||
|
||
## Assembly-based approach | ||
One approach is to first **assemble** the reads, which means to stitch the reads together to reconstruct the chromosome which was fragmented during the sequencing process. | ||
This is a challenging task and is computationally demanding. | ||
We will not delve into this approach in this workshop. | ||
|
||
|
||
## Assembly-free | ||
What we will explore is the assembly-free approach, in which the reads are compared against reference databases to answer the questions posed above. In particular, we will look at Questions 1 to 3. | ||
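
To make the idea of comparing reads against a reference database concrete, here is a minimal toy sketch. It assumes a simple shared k-mer count as the matching rule, which is an illustration only and not the method of any particular tool. The names `species_X` and `species_Y`, the sequences, and the helper functions are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(read, reference_kmers, k):
    """Assign a read to the reference sharing the most k-mers with it.

    Returns None when the read shares no k-mer with any reference.
    """
    read_kmers = kmers(read, k)
    scores = {name: len(read_kmers & ref) for name, ref in reference_kmers.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hypothetical toy "reference database": real databases hold whole genomes.
references = {
    "species_X": "ACGTACGTTAGC",
    "species_Y": "GGGCCCTTTAAA",
}
k = 4
reference_kmers = {name: kmers(seq, k) for name, seq in references.items()}

# Hypothetical short reads (real reads are around 100 bp).
reads = ["ACGTTAGC", "CCCTTTAA", "GTACGTTA"]

# Counting assignments per species gives a crude community profile.
assignments = Counter(classify(read, reference_kmers, k) for read in reads)
print(assignments)
```

The per-species counts address Question 1 (which species are present), and dividing them by the total number of classified reads gives a rough answer to Question 2.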
|
||
## Assembly-based | ||
$$ | ||
x | ||
$$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,27 @@ | ||
# Data generation process | ||
# Metagenomics and the data generation process | ||
|
||
## What is metagenomics? | ||
For the purposes of this workshop, we define metagenomics as the application of high-throughput sequencing to DNA extracted directly from environmental, uncultured samples. | ||
For example, later we will be looking at samples of the microbial community found in hospital wastewater. | ||
|
||
## From samples to sequences | ||
The figure below shows how we go from environmental samples, in this case hospital wastewater, to sequence data. | ||
|
||
![metagenomics-sample-to-seq](imgs/metagenomics-sample-seq-slim-jpg.jpeg) | ||
|
||
Metagenomic sampling produces a lot of sequence data, which is the starting point of bioinformatics analysis. | ||
|
||
|
||
## How the statistician sees it | ||
Metagenomics data is compositional data. | ||
Most analyses will be based on relative abundance. | ||
|
||
There is meaning only in the relative abundances observed in the sample. | ||
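
As a small illustration of why only relative abundances carry meaning, the sketch below converts raw read counts into proportions. The taxon names and counts are made up for the example, and `relative_abundance` is a hypothetical helper, not part of any library.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads counted")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical counts for the same community sequenced at two depths:
# the deeper run yields ten times as many reads for every taxon.
shallow = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 20}
deep = {taxon: 10 * n for taxon, n in shallow.items()}

print(relative_abundance(shallow))
print(relative_abundance(deep))
```

Both runs give the same proportions, even though the raw counts differ tenfold. The total number of reads reflects sequencing depth, not the amount of DNA in the environment.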
|
||
![compositional-data](imgs/compositional.jpeg) | ||
|
||
## Discussion | ||
Are you planning to use metagenomics for your study? | ||
What is the experiment design? | ||
What is the experiment design? | ||
|
||
## Further reading | ||
- Gloor et al., [Microbiome Datasets Are Compositional: And This Is Not Optional](https://doi.org/10.3389/fmicb.2017.02224), Front. Microbiol., 2017. |
Binary file not shown.