I think there is a minor bug in (hopefully just) logging when using a very small genome.
I have a 3326 bp reference (it is GenBank MK628543.1) and when I run this command:
STAR --runThreadN 4 --runMode genomeGenerate --genomeDir STAR --genomeFastaFiles reference.fasta --genomeSAindexNbases 5
Similarly, if I run without the genomeSAindexNbases parameter I get this warning:
WARNING: --genomeSAindexNbases 14 is too large for the genome size=262144, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 8
I think this error is coming from the "genomeChrBinNbits" default value of 18 (as this is 2^18) and something to do with the "pad the chromosomes to bins boundaries" on lines 49-51 of genomeScanFastaFiles.cpp - but presumably even if this number is used internally it isn't the number which should be logged and also used to calculate the genomeSAindexNbases recommendation?
This number (262144) also appears in the chrStart.txt file.
I'm running STAR 2.7.5a (build 0 from bioconda) on Ubuntu. My Log.out (from the run without setting genomeSAindexNbases) is attached.
The "size" that's reported is indeed just 2^18. You can reduce it to 2^12 = 4096 with --genomeChrBinNbits 12. This is enough to contain a genome 3326 bases long, and it will reduce RAM usage.
You do not need to worry about this number: it is only used internally, and it does not affect the output.
You can check the chromosomes' lengths in the Log.out file of the mapping stage, after this line:
"Number of real (reference) chromosomes".
They should be exactly as you expect them.
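The padding arithmetic discussed above can be sketched in a few lines of Python. This is only an illustration, not STAR's actual code: the function name `padded_genome_size` is hypothetical, and the rounding rule (each chromosome end is rounded up to the next multiple of 2^genomeChrBinNbits) is an assumption chosen because it reproduces the two sizes quoted in this thread.

```python
def padded_genome_size(chrom_lengths, chr_bin_nbits=18):
    """Total genome length after padding each chromosome to a bin boundary.

    Assumption (not taken from the STAR source): the running total is
    rounded up to the next multiple of 2**chr_bin_nbits after each chromosome.
    """
    bin_size = 1 << chr_bin_nbits
    total = 0
    for length in chrom_lengths:
        total += length
        # Round up to the next bin boundary, always leaving some padding.
        total = ((total + 1) // bin_size + 1) * bin_size
    return total


print(padded_genome_size([3326]))      # default bin size 2^18 -> 262144
print(padded_genome_size([3326], 12))  # bin size 2^12 -> 4096
```

Under this assumption a single 3326-base chromosome occupies one full bin, so the reported "size" is the bin size itself rather than the sequence length.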
Hi Alex,
What do you think about this:
specifies the number of threads to use for creating the genome index.
It was calculated according to this formula:
min(14, log2(ReferenceLength)/2 - 1)
For example, in Arabidopsis the reference genome size is 154478 bases, and the formula gives:
min(14, log2(154478)/2 - 1) = min(14, 7.62) ≈ 7.6
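For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula quoted above. The function name `sa_index_nbases` is hypothetical, and the value is left unrounded because the thread does not say how STAR rounds it.

```python
import math


def sa_index_nbases(genome_length):
    """Evaluate min(14, log2(genome_length)/2 - 1) without rounding."""
    return min(14.0, math.log2(genome_length) / 2 - 1)


# Lengths mentioned in this thread: the 3326 bp reference, the 154478-base
# example, and the padded size 262144 from the warning message.
for n in (3326, 154478, 262144):
    print(n, round(sa_index_nbases(n), 2))
```

For 262144 the expression is exactly 8, which matches the value recommended in the warning. For 154478 it comes to about 7.6, and for 3326 to about 4.85.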
Hi Alex,
Log.out.txt
@GeegC
Thanks very much,
Katy