Releases: TyberiusPrime/pypipegraph2

Performance release

25 Nov 12:58

We had an unnecessary busy loop.
This has been fixed, which should free up a CPU core again.

Bugfix

22 Nov 11:09

This release resolves a rare 'don't know what to do next'
bug in the engine.

.sha256 in addition to 'normal' file hashes

21 Nov 07:49

We now support jobs creating
.sha256 files, each containing a 64-character hexdigest (presumably a sha256 hash),
next to their output files.

These are then used instead of recalculating a hash using xxhash_128().

This allows both for speedups (for example, tools like mbf_fastq_parser
can generate such hash files essentially for free)
and for file hashes that are not strictly a hash of the bytes.

(For example, the STAR aligner will generate differently compressed files
that are nevertheless identical in uncompressed content.)
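
A minimal sketch of how a job might produce such a file, using only the Python standard library. The sidecar name (output filename plus '.sha256') and both helper functions are assumptions made for illustration, they are not part of the pypipegraph2 API. The second helper hashes the uncompressed content of a gzip file, so two differently compressed files with the same content end up with the same digest.

```python
import gzip
import hashlib
from pathlib import Path


def write_sha256_sidecar(output_path, hexdigest):
    """Write hexdigest into '<output_path>.sha256' (assumed naming convention)."""
    if len(hexdigest) != 64:
        raise ValueError("expected a 64 character hexdigest")
    sidecar = Path(str(output_path) + ".sha256")
    sidecar.write_text(hexdigest)
    return sidecar


def sha256_of_uncompressed(gz_path, chunk_size=1 << 20):
    """Hash the decompressed content of a gzip file, not its raw bytes."""
    digest = hashlib.sha256()
    with gzip.open(gz_path, "rb") as handle:
        # Read in chunks so large files do not have to fit into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A job would call write_sha256_sidecar(path, sha256_of_uncompressed(path)) after writing its output.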

Signal reentrant safety

07 Nov 10:50

There was a chance of a deadlock if you hit ctrl-c at the wrong time.
This release should fix that.

ExternalJobs, DictEntryLoadingJobs and fork safety

04 Nov 15:19

There is fancy documentation now: https://tyberiusprime.github.io/pypipegraph2/

This release brings three new jobs:

DictEntryLoadingJob and CachedDictEntryLoadingJob,
which are essentially 'reskins' of AttributeLoadingJobs that store in ['key'] instead of .key,

and ExternalJob, which is used to run external CLI programs.
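
To illustrate the ['key'] versus .key distinction only, here is a small sketch in plain Python. The two helper functions are hypothetical and do not use the pypipegraph2 API (the actual job signatures are described in the documentation); they just show where the loaded value ends up.

```python
from types import SimpleNamespace


def load_into_attribute(target, name, loader):
    """Store loader() as an attribute, i.e. target.<name>."""
    setattr(target, name, loader())


def load_into_dict_entry(target, key, loader):
    """Store loader() as a dictionary entry, i.e. target[key]."""
    target[key] = loader()


holder = SimpleNamespace()
load_into_attribute(holder, "key", lambda: [1, 2, 3])  # read back via holder.key

cache = {}
load_into_dict_entry(cache, "key", lambda: [1, 2, 3])  # read back via cache["key"]
```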

This release also makes numba use TBB (if numba can be imported), which is supposedly thread-safe and, more importantly, fork-safe. In addition, all forking now happens on the main thread, which should help with overall fork safety.

(Unfortunately, forking is never safe. If you hold locks in different threads across forks, you're going to deadlock.
But then, forking is the only way to get decent multi-core performance out of Python when spawn() plus importing Python modules takes on the order of seconds.)
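
The lock problem can be reproduced with a short, POSIX-only sketch. It is not taken from pypipegraph2; the function name and the timeout are made up for illustration. A helper thread holds a lock while the process forks, and the child then tries to take the copied lock. A timeout is used so the child gives up instead of hanging forever.

```python
import os
import threading


def child_can_acquire_after_fork(timeout=0.5):
    """Fork while a helper thread holds a lock; report whether the child got it."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()
            release.wait()  # keep holding the lock until told to let go

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied. The holder thread does
        # not exist here, so nothing will ever release the copied lock.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then let the helper thread finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 0
```

Without the timeout, the child would block indefinitely, which is the deadlock described above. Recent Python versions may also emit a DeprecationWarning when forking a multi-threaded process.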

There's also a fix to job-deduplication which was handing out too many 'new' objects.

A CLI to hash files just like ppg2 does was added.

A signal storm that occurred when a well-connected upstream job failed has been fixed.

There is an experimental 'signal when a job was done' pipe included that can be used by pypipegraph2-interactive.

(pandas) DataFrame hashing will now work correctly, even if DeepDiff is older than version 8.0

The history is now (transparently) converted to zstd, making it much smaller on disk.

Minimum Python version has increased to 3.9.

History dumping can no longer be aborted by KeyboardInterrupt.

v3.1.4

16 Sep 10:00
  • Requires deepdiff with pandas fixes
  • Fixes a lock issue in SharedMultiFileGeneratingJobs

v3.1.3

17 Jul 09:06

Reenables logging

v3.1.2

17 Jul 05:56

Falls back to pyo3 0.21 for compatibility with older Rust
(Rust < 1.79 will be supported again by 0.22.2, but not by 0.22.1).

Fix cargo.lock

16 Jul 09:32

Fixes cargo.lock, which had not been updated in 3.1.0.

Zstd history

16 Jul 09:26

This release silently upgrades your gzipped history to a zstd-compressed one,
for great gains in the time it takes to save the history and smaller file sizes.