Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG: support proper manifest creation with --relpath for sig check and sig collect #3054

Merged
merged 31 commits into from
Mar 8, 2024

Conversation

ctb
Copy link
Contributor

@ctb ctb commented Mar 1, 2024

This PR updates sig collect and sig check so that they can produce standalone manifests that work properly with default sourmash loading behavior. The default behavior produces broken manifests in some situations and is not changed, but will be deprecated in v5.

Details

Currently, sig collect and sig check default to producing standalone manifests with internal path locations relative to the current working directory. This conflicts with the default StandaloneManifest behavior implemented in save_load.py that loads path locations relative to the manifest location. As a result, whenever the manifest was in a subdirectory, the standalone manifests output by sig check and sig collect were broken. The only way to make good manifests in this situation was to use sig collect --abspath, but sig check didn't support --abspath, and using absolute paths is brittle in situations where you want to distribute manifests.

This PR adds --relpath to both sig check and sig collect, and adds --abspath to sig check. It also demonstrates the bad behavior in tests and annotates the tests appropriately.

See #3008 (comment) for more detailed discussion of why I think --relpath is the right behavior for the future.

  • adds --abspath and --relpath to sig check, to properly support relative paths;
  • adds --relpath to sig collect, to properly support relative paths;
  • documents this behavior properly for creating standalone manifests;
  • create issue to change default sig check and sig collect behavior for v4, and disable cwd behavior.

Techie TODO:

  • explicitly test relpath and abspath behavior in sig check;
  • explicitly test relpath behavior in sig collect
  • write some tests for sig check and sig collect to explore the relative path loading issue, with all three combinations of relpath: mf in cwd, sigs in subdir; mf in subdir, sigs in cwd; mf in subdir, sigs in subdir.

Related issues:

Copy link

codecov bot commented Mar 1, 2024

Codecov Report

Attention: Patch coverage is 97.50000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 86.79%. Comparing base (d578157) to head (1e3688a).

Files Patch % Lines
src/sourmash/sig/__main__.py 96.96% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           latest    #3054      +/-   ##
==========================================
+ Coverage   86.78%   86.79%   +0.01%     
==========================================
  Files         136      136              
  Lines       15524    15559      +35     
  Branches     2626     2635       +9     
==========================================
+ Hits        13472    13505      +33     
- Misses       1751     1752       +1     
- Partials      301      302       +1     
Flag Coverage Δ
hypothesis-py 25.46% <0.00%> (-0.08%) ⬇️
python 92.86% <97.50%> (+<0.01%) ⬆️
rust 61.21% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ctb ctb added the idea label Mar 3, 2024
@ctb ctb changed the title EXP: support better/different manifest relpath loading WIP: support proper manifest creation with --relpath for sig check and sig collect Mar 3, 2024
@ctb ctb changed the title WIP: support proper manifest creation with --relpath for sig check and sig collect MRG: support proper manifest creation with --relpath for sig check and sig collect Mar 3, 2024
@ctb
Copy link
Contributor Author

ctb commented Mar 3, 2024

Ready for review & merge @sourmash-bio/devs !

Copy link
Contributor

@bluegenes bluegenes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! This will definitely help with manifest utility 🎉

@ctb ctb merged commit a90ab01 into latest Mar 8, 2024
41 checks passed
@ctb ctb deleted the manifest_relpath branch March 8, 2024 23:34
@ctb ctb mentioned this pull request Mar 21, 2024
ctb added a commit that referenced this pull request Mar 21, 2024
# Release notes for sourmash v4.8.7

Note: This release changes the way `sourmash multigather` names output
files in some situations. Please see
#2722 for details.

Minor new features:

* support proper manifest creation with `--relpath` for `sig check` and
`sig collect` (#3054)
* fix `multigather` output by adding md5sum along with
`-U/--output-add-query-md5sum` (#2722)
* enable loading lineages from annotated gather with match_name instead
of name (#3078)

Bug fixes:

* fix output for `sketch ... --singleton` (#3066)
* fix `calculate_gather_stats` `threshold=0` bug (#3052)

Cleanup and documentation updates:

* adjust protein ksize for record/manifest (#3019)
* Resolve `sourmash gather --help` issue (#3032)
* rework the manifest documentation; do misc cleanup (#3027)
* add branchwater web to docs (#3018)

Developer updates:

* make core Manifest booleans python compatible (core) (#3007)
* safer ksize selection while still accommodating k=k*3 (#3028)
* fix clippy beta issues (#3088)
* tell dependabot to ignore upgrades to `byteorder`, `chrono`,
`once_cell`, and `wasm-bindgen` (#3065)
* update rust changelog for r0.13.0 in preparation for release (#3033)
* Allow changing storage location for a collection in RevIndex (#3015)
* Fix tox and nix configs so all tox tests execute correctly (#2992)
* Calculate all gather stats in rust; use for rocksdb gather (#2943)
* bump screed req to 1.1.3 (#3067)
* bump to v4.8.7-dev (#2989)

Dependabot updates:

* Bump DeterminateSystems/magic-nix-cache-action from 1 to 3 (#2994)
* Bump DeterminateSystems/magic-nix-cache-action from 3 to 4 (#3085)
* Bump DeterminateSystems/nix-installer-action from 4 to 9 (#2995)
* Bump DeterminateSystems/nix-installer-action from 9 to 10 (#3083)
* Bump chrono from 0.4.33 to 0.4.34 (#3000)
* Bump conda-incubator/setup-miniconda from 3.0.1 to 3.0.2 (#3046)
* Bump conda-incubator/setup-miniconda from 3.0.2 to 3.0.3 (#3057)
* Bump histogram from 0.9.0 to 0.9.1 (#3002)
* Bump itertools from 0.12.0 to 0.12.1 (#3043)
* Bump log from 0.4.20 to 0.4.21 (#3062)
* Bump num-iter from 0.1.43 to 0.1.44 (#2997)
* Bump pypa/cibuildwheel from 2.16.5 to 2.17.0 (#3084)
* Bump rayon from 1.8.1 to 1.9.0 (#3058)
* Bump roaring from 0.10.2 to 0.10.3 (#3014)
* Bump serde from 1.0.196 to 1.0.197 (#3045)
* Bump serde_json from 1.0.113 to 1.0.114 (#3044)
* Bump tempfile from 3.10.0 to 3.10.1 (#3059)
* Bump thiserror from 1.0.56 to 1.0.57 (#2999)
* Bump thiserror from 1.0.57 to 1.0.58 (#3082)
* Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060)
* Bump wasm-bindgen-test from 0.3.40 to 0.3.41 (#2996)
* Bump wasm-bindgen-test from 0.3.41 to 0.3.42 (#3063)
* Bump web-sys from 0.3.67 to 0.3.68 (#2998)
* Bump web-sys from 0.3.68 to 0.3.69 (#3061)
* Revert "Bump wasm-bindgen from 0.2.91 to 0.2.92 (#3060)" (#3064)
* Update asv to 0.6.2 (#3025)
* Update pytest requirement from <8.1.0,>=6.2.4 to >=6.2.4,<8.2.0
(#3075)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

relative path interpretation for manifests is a little fubar
2 participants