Changelog

All notable changes to patteRNA are documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[2.1] - 2021-10-20

Changes

Improved documentation [PR]
Improved testing [PR]

Fixed

Improved logging when using reference structures [PR]

[2.1-beta] - 2021-08-29

Changes

Stable implementation of logistic scoring classifier (can be turned off with --no-vienna or --no-cscores flags) [PR]
- Added new dependency to ViennaRNA Python interface
Added --nan flag to interpret missing data and --print-nan flag to print NaN scores [PR]
Improved documentation [PR]

Fixed

Outputs for a transcript are computed and written all at once instead of on-the-fly during parallel processing, speeding up scoring [PR]

[2.0] - 2021-08-24

Changed

Re-implemented --reference flag to natively work with new training pipeline [PR]
Code optimization and cleanup

[2.0.0-beta] - 2021-07-11

Changed

Fixed bug in scoring phase that rendered c-score compilation non-deterministic. [PR]
Addition of --SPP flag to compute smoothed P(paired) as a structure profile. [PR]
Improved documentation [PR]

[2.0.0-alpha] - 2021-02-23

Changed

Major version release with almost a full rewrite of the method. [PR]
Addition of --HDSL flag to compute local structure levels. [PR]
Addition of a new Discretized Observation Model (DOM) emission model scheme, which is more precise for scoring and faster than a GMM. [PR]
New c-score distribution sampling procedure is much more efficient than before. [PR]
Now using human-readable .json format for saving a loading trained models. [PR]
Matplotlib backend to svg. [PR]
Dependencies: humanfriendly. [PR]

[1.2.2] - 2020-01-08

Changed

Updated sample data with corrected structures. [PR]
Updated README to reflect current developers. [PR]
Fixed sample data. [PR]

[1.2.1] - 2019-04-10

Changed

Installation procedure. [ML]
Matplotlib backend to Agg. [ML]

[1.2.0] - 2018-06-11

Supervised initialization of Model's parameters based on reference RNA secondary structures in dot-bracket notation supplied via the new --reference flag. Note that --reference supports RNAstructure's ct2dot output format. [ML]
Simulation framework for testing (devs only). [ML]
Scoring motifs now returns, by default, a c-score based on a fitted Null distribution in addition to the original score. [PR/ML]
Flag --no-cscores to turn off the computation of c-scores. [ML]
The training set is now built automatically using KL divergence metrics (via option --KL-div). Data-dense transcripts are prioritized. [ML]
Infinite values in structure profiles are now supported. [ML]
Added a checkpoint to ensure paired/unpaired states are never flipped in the model. [ML]
Dependencies: cairosvg (needed by pygal). [ML]
Dependencies: matplotlib. [ML]

Changed

Motifs were all observations are missing now return NaN scores. [ML]
Progress bar during scoring tracks individual transcripts instead of batches. [PR]
The behavior of the --min-density/-d CLI flag was changed to affect both training and scoring. [ML]
The default value for the --min-density/-d CLI flag was changed to 0 (i.e. all transcripts are used by default). [ML]
Now renders all plots as PNG instead of SVG. [ML]
Re-vamped user messages printed during the task. [ML]

Removed

Removed -n CLI flag (obsolete). [ML]
Removed --filter-test CLI flag (obsolete). [ML]

[1.1.4] - 2018-02-13

Fixed

Bugfix. Sequence constraints contained a bug affecting non fully nested target motifs. [PR]

[1.1.3] - 2018-02-13

Fixed

Bugfix. Output Viterbi and posterior files were not deleted if already existing. [ML]

[1.1.2] - 2018-02-06

Changed

Unsupervised initialization now uses by default an initial transition probability matrix derived from the Weeks set and GMM means based on data percentiles for increased robustness. [ML]

Fixed

Bugfix over v1.1.1 which was removed [ML]

[1.1.0] - 2018-01-18

Added

The number of Gaussian components (-k) can be determined automatically using Aikaike Information Criteria (AICs). [ML]
Automatic detection of the experimental assay based on the input observation filename extension. [ML]
Motifs are sorted by scores in scores.txt. [ML]
Dependencies: scikit-learn and scipy (needed by scikit-learn). [ML]
Motif dot-brackets are declared using the option --motif and by default sequence constraints are applied. [ML]
Pairing state sequences are declared using the option --path and used either alone or as a mask to --motif. [ML]

Changed

Structure profiling observations were removed from the output score file scores.txt to minimize file size. [ML]
Versioned the latest tested patteRNA distributions to latest. [ML]
Refactored file_handler.py to handle all FASTA-like files within a single function. [ML]
Option --gammas is now named --posteriors. [ML]
Option --nogmm is now devs only. [ML]
Options --pattern and -s/--seq are now legacy and will be deprecated in the future. [ML]

Removed

Removed parameter for final state probabilities rho (obsolete). [ML]
Removed -wmin as it is now obsolete. [ML]
Removed --PARS CLI flag as file extensions are used to determine the assay (obsolete). [ML]
Removed --debug CLI flag (obsolete). [ML]

Fixed

Fixed a bug where the last entry of input files would not be read. [ML]
Minor bugfixes. [ML]
Minor runtime optimizations. [ML]

[1.0.0] - 2017-12-12

Added

Initial release. [ML]