
V2.2.0

@cheny19 cheny19 released this 04 Dec 22:32
· 235 commits to master since this release
b3b8578

This version has been tested on Python 2.7 and Python 3.6, each with its latest compatible packages. This release includes several major changes, so the pre-trained model profiles on our FTP site are no longer compatible with it. Users are still welcome to use the FASTA files there for training. We will provide new pre-trained models soon.

Major changes:

  1. Use kernel density estimation (KDE) instead of the empirical cumulative distribution function (ECDF) to simulate the length distribution of reads (aligned and unaligned).
  2. Removed the binning strategy used when simulating the alignment ratio of each read, so the length distribution of simulated reads is smoother.
  3. Introduced the --median_len and --sd_len options. Users can set these two options to control the median read length and its standard deviation. When they are set, read lengths follow a lognormal distribution instead of the empirical length distribution of the training reads.
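
To illustrate why change 1 smooths the simulated lengths, here is a minimal sketch contrasting ECDF-style sampling with sampling from a Gaussian KDE. It is not NanoSim's implementation: the function names, the toy `training` lengths, the choice to fit the KDE on log-lengths, and the rule-of-thumb bandwidth are all assumptions made for illustration.

```python
import math
import random
import statistics


def sample_ecdf(lengths, n, rng):
    """Inverse-ECDF sampling: every draw is one of the observed lengths."""
    return [rng.choice(lengths) for _ in range(n)]


def rule_of_thumb_bandwidth(values):
    """Normal-reference rule-of-thumb bandwidth for a Gaussian kernel."""
    return 1.06 * statistics.stdev(values) * len(values) ** (-1 / 5)


def sample_kde(lengths, n, rng, bandwidth=None):
    """Sample from a Gaussian KDE fitted to log-lengths (illustrative choice).

    Drawing from a Gaussian KDE is equivalent to picking a training point
    at random and adding Gaussian noise whose standard deviation is the
    bandwidth. Working in log space keeps every simulated length positive.
    """
    logs = [math.log(x) for x in lengths]
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(logs)
    return [max(1, round(math.exp(rng.choice(logs) + rng.gauss(0, h))))
            for _ in range(n)]


rng = random.Random(0)
training = [1200, 3400, 5600, 5800, 9100, 15000]  # hypothetical read lengths
ecdf_reads = sample_ecdf(training, 1000, rng)
kde_reads = sample_kde(training, 1000, rng)

print(len(set(ecdf_reads)))  # at most len(training) distinct lengths
print(len(set(kde_reads)))   # many distinct lengths: a smoother distribution
```

The ECDF sampler can only reproduce lengths that appear in the training set, whereas the KDE sampler fills in the gaps between them.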

Note:

For ONT reads, the median length and the mean length are quite different. Read lengths generally follow a lognormal distribution, so please refer to Wikipedia for details about these two parameters. For R9 1D reads, the values are --median_len 5642 and --sd_len 1.015; they are roughly the same for other libraries.
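
The gap between median and mean can be seen in a short sketch. It assumes that --sd_len is the standard deviation of the log-lengths (the lognormal sigma parameter), which the value 1.015 suggests but these notes do not state. Under that assumption, mu = ln(median) and the mean equals median * exp(sigma^2 / 2). The function names are hypothetical and are not part of NanoSim.

```python
import math
import random
import statistics


def simulate_lengths(median_len, sd_len, n, seed=None):
    """Draw n read lengths from a lognormal distribution.

    Assumption: sd_len is the standard deviation in log space, so the
    lognormal parameters are mu = ln(median_len) and sigma = sd_len.
    """
    rng = random.Random(seed)
    mu = math.log(median_len)
    return [max(1, round(rng.lognormvariate(mu, sd_len))) for _ in range(n)]


def lognormal_mean(median_len, sd_len):
    """Theoretical mean of a lognormal: median * exp(sigma^2 / 2)."""
    return median_len * math.exp(sd_len ** 2 / 2)


lengths = simulate_lengths(5642, 1.015, n=100000, seed=1)
print(statistics.median(lengths))     # sample median, close to 5642
print(statistics.mean(lengths))       # sample mean, well above the median
print(lognormal_mean(5642, 1.015))    # theoretical mean for comparison
```

Because the lognormal distribution has a long right tail, a small number of very long reads pulls the mean well above the median, which is why the option takes the median rather than the mean.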