v0.5.0

@keithchev keithchev released this 10 May 16:22
· 4 commits to main since this release
b6dfc76

ProteinCartography v0.5.0

Overview

This release includes a number of minor improvements and introduces a new organization for the output directories generated by the pipeline. Because Snakemake is a file-based workflow engine that decides what to run from the output files it finds on disk, this change makes this version of the pipeline incompatible with previous versions: the new version cannot be re-run on output directories that were generated by prior versions. Any such run must instead be re-run from scratch.

New features and improvements

  • Reorganize the directory of output files to more clearly distinguish the final outputs of the pipeline from intermediate outputs. (This is a breaking change; see the Overview above.)
  • Merge Snakefile_ff (the "cluster" mode of the pipeline) into the main Snakefile and add a config parameter to specify whether to run the pipeline in "search" or "cluster" mode.
  • Update and clarify some sections of the main README.
  • Add developer docs.
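
To illustrate the idea behind the merged Snakefile, here is a minimal sketch of how a single entry point might validate a mode setting before choosing which workflow to run. The key name `mode` and the helper `resolve_mode` are hypothetical and used only for illustration; check the repository's config file for the actual parameter name.

```python
# The two modes named in the release notes.
VALID_MODES = ("search", "cluster")


def resolve_mode(config: dict) -> str:
    """Return the validated pipeline mode from a config mapping.

    The 'mode' key is a hypothetical name, not confirmed by the release notes.
    """
    mode = config.get("mode")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return mode
```

Failing early on a missing or misspelled mode avoids starting a long run with the wrong workflow.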

Fixes

  • Generate TM scores for each of the input proteins versus all of the query proteins (previously, some input-query protein pairs did not have a TM score due to Foldseek's filtering).
  • Fix a bug that may have prevented the pipeline from running when only input FASTA files (rather than PDBs) are provided.
  • Use unverified requests to query the ESMFold API as a workaround for ESMFold's expired SSL certificates (from external contributor @naailkhan28).
  • Add integration tests for the "cluster" mode of the pipeline.
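
For readers unfamiliar with the ESMFold workaround listed above, the sketch below shows one way to send an HTTPS request with certificate verification turned off. It uses only the Python standard library rather than whatever HTTP client the pipeline itself uses, and the endpoint URL and the function names are assumptions for illustration, not the pipeline's code.

```python
import ssl
import urllib.request

# Assumed endpoint for the public ESMFold API; confirm it before relying on it.
ESMFOLD_URL = "https://api.esmatlas.com/foldSequence/v1/pdb/"


def make_unverified_context() -> ssl.SSLContext:
    """Build an SSL context that skips certificate and hostname checks."""
    context = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def build_fold_request(sequence: str, url: str = ESMFOLD_URL) -> urllib.request.Request:
    """Create a POST request whose body is the raw amino acid sequence."""
    return urllib.request.Request(url, data=sequence.encode("ascii"), method="POST")


def fold_sequence(sequence: str, timeout: float = 60.0) -> str:
    """Send the sequence and return the response body as text (performs network I/O)."""
    request = build_fold_request(sequence)
    with urllib.request.urlopen(
        request, context=make_unverified_context(), timeout=timeout
    ) as response:
        return response.read().decode("utf-8")
```

Skipping verification removes protection against man-in-the-middle attacks, so it is best treated as a temporary measure and limited to the one host whose certificate has expired.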