Day 3

Time Activity Slides Hands-on
Morning Metagenome assembly Link here Link here
Afternoon Assembly QC Link here

Metagenome assembly

First, log in to our cloud instance using the IP address provided and cd to your working directory.
Then pull any changes from the GitHub repository:

cd physalia_metagenomics
git pull origin main

Next we'll go through the metagenomic assembly step, but we won't run the actual assembly script.
An assembly can take days and needs more resources than we have on our instance.
The assemblies will therefore be provided.

Short-read assembly with megahit

The short reads will be assembled using megahit.
Although we won't be running the actual assembly, megahit is installed on our instance.

So have a look at the different options you can change in the assembly.
You can find more information about megahit in the megahit wiki.
You don't need to understand each and every option, but some of them can be important.

conda activate assembly_env
megahit -h
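
The help text is long, so it can help to filter it down to a handful of options. The sketch below assumes megahit is on the PATH once assembly_env is active. The list of option names is an illustrative selection of ours, not a recommendation on what to change.

```shell
# Filter a command's help text for a few commonly tuned megahit options.
# The option list is an illustrative selection, not a course recommendation.
show_key_options() {
    # Help text may be printed to stderr, so merge both streams before filtering.
    "$@" 2>&1 | grep -E -e '--(k-min|k-max|k-step|k-list|min-contig-len|presets|num-cpu-threads|memory)' || true
}

if command -v megahit >/dev/null 2>&1; then
    show_key_options megahit -h
else
    echo "megahit not found: activate assembly_env first" >&2
fi
```

The full output of megahit -h is still the reference. The filter only makes the k-mer, contig length and resource options easier to spot.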

Questions about megahit

  1. What do you think would be important? What would you change or set?
  2. What version of megahit have we installed? Is it the latest?

After that have a look at the assembly script Scripts/MEGAHIT.sh.
Open it with a text editor or print it on the screen with less.

Would you have changed something else and why?
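
For orientation, a megahit call for one paired-end sample usually has the shape sketched below. This is not the content of Scripts/MEGAHIT.sh: the sample name, read folder, file naming, thread count and minimum contig length are all hypothetical placeholders. The command is only printed, not executed.

```shell
# Dry run: print (do not execute) a megahit command for one paired-end sample.
# Sample name, folders, thread count and contig cutoff are hypothetical placeholders.
print_megahit_cmd() {
    sample=$1      # sample name
    reads_dir=$2   # folder holding the trimmed reads
    # -1/-2: forward and reverse reads; -o: output folder (must not exist yet)
    # -t: CPU threads; --min-contig-len: discard contigs shorter than this
    printf 'megahit -1 %s/%s_R1.fastq.gz -2 %s/%s_R2.fastq.gz -o assembly_%s -t 8 --min-contig-len 1000\n' \
        "$reads_dir" "$sample" "$reads_dir" "$sample" "$sample"
}

print_megahit_cmd SAMPLE1 trimmed_reads
```

Comparing a printed command like this with the real script is a quick way to see which options were set explicitly and which were left at their defaults.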

Once we're satisfied with the assembly options, we would start the assembly and wait from a few hours to several days, depending on the data and the computational resources.
But we won't do that, since we don't have the time or the resources.
Instead, you can use the assemblies and log files we have made: make a soft link to them in your own folder, as before.
We have removed some intermediate files, so the folder contains only some of the files megahit normally produces.
The most important one is final.contigs.fa, which contains the final contigs, as one might expect.

cd ~/physalia_metagenomics
ln -s ~/Share/ASSEMBLY_MEGAHIT/ ./
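
To confirm that the link works and get a first impression of each assembly, you can count the sequences in final.contigs.fa. The sketch below assumes each sample has its own subfolder under ASSEMBLY_MEGAHIT containing final.contigs.fa. Adjust the glob if the layout differs.

```shell
# Count contigs in a FASTA file: every sequence header starts with ">".
count_contigs() {
    grep -c '^>' "$1"
}

# Assumed layout: ASSEMBLY_MEGAHIT/<sample>/final.contigs.fa
for fasta in ASSEMBLY_MEGAHIT/*/final.contigs.fa; do
    [ -e "$fasta" ] || continue          # skip if the glob matched nothing
    printf '%s\t%s\n' "$fasta" "$(count_contigs "$fasta")"
done
```

If nothing is printed, check with ls -l that the link points to the shared folder.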

Inside the linked folder, each sample has its own assembly folder, which contains the assembly log.
Start by looking at the assembly logs with less.
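
If scrolling through every log in less gets tedious, grep can pull out lines of interest from all samples at once. In the sketch below, the search patterns and the assumption that the log files have "log" in their file name are guesses of ours. Open one log with less first to see what the lines actually look like.

```shell
# Print lines matching a pattern from every log file below a folder.
# Assumption: the log files have "log" somewhere in their file name.
search_logs() {
    pattern=$1
    folder=$2
    # -L follows the soft link; grep -H prefixes each hit with its file name.
    find -L "$folder" -type f -name '*log*' -exec grep -H -i -e "$pattern" {} +
}

if [ -d ASSEMBLY_MEGAHIT ]; then
    # Both patterns are guesses at the log wording; adjust after inspecting a log.
    search_logs 'MEGAHIT v' ASSEMBLY_MEGAHIT || true
    search_logs 'Time elapsed' ASSEMBLY_MEGAHIT || true
fi
```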

Questions about the assembly

  1. Which version of megahit did we actually use for the assemblies?
  2. How long did t