After running an analysis with the `--gvcf` option on a 50 GB BAM file containing 4 ONT runs, against the HG19 reference, the resulting `tmp` output subfolder takes 419 GB, plus 117 GB in the main output folder. It would probably make sense to remove the partial VCF files once they have been concatenated and sorted, and to compress the output. For instance, a 117 GB GVCF file takes only 8.5 GB when bzip2-compressed, and some tools, such as lbzip2, can decompress it in parallel. Perhaps you want to minimize dependencies, but disk space efficiency also matters when renting servers with fast SSDs.
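A minimal sketch of the suggested cleanup, using only the Python standard library. It is not the tool's actual implementation: the function names, the `*.vcf` pattern for partial files, and the `.bz2` suffix are all assumptions made for illustration. The standard `bz2` module is single-threaded, so an external tool such as lbzip2 would be needed for parallel compression or decompression.

```python
import bz2
import shutil
from pathlib import Path


def compress_and_remove(path: Path, chunk_size: int = 1 << 20) -> Path:
    """Stream `path` into `<path>.bz2`, then delete the uncompressed original."""
    target = path.with_name(path.name + ".bz2")
    # Copy in fixed-size chunks so a 100+ GB file never has to fit in memory.
    with open(path, "rb") as src, bz2.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    path.unlink()
    return target


def remove_partial_files(tmp_dir: Path, pattern: str = "*.vcf") -> int:
    """Delete partial files matching `pattern` (hypothetical naming); return the count."""
    removed = 0
    for partial in tmp_dir.glob(pattern):
        partial.unlink()
        removed += 1
    return removed
```

Calling `remove_partial_files` only after the merged file has been written and compressed keeps the partial files available for a retry if the merge fails.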
In the next release, we will 1) compress the intermediate files for GVCF output, and 2) provide an option for users to delete intermediate files as soon as they are no longer needed.