Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

units unit fix #157

Merged
merged 1 commit into from
Mar 9, 2023
Merged

units unit fix #157

merged 1 commit into from
Mar 9, 2023

Conversation

will-moore
Copy link
Member

Fixes #156

@github-actions
Copy link
Contributor

github-actions bot commented Nov 2, 2022

Automated Review URLs

@joshmoore
Copy link
Member

So just to clarify: we (1) generated invalid (from omero-cli-zarr) and (2) validated invalid datasets, but the spec never said that it was correct, right?

I'm trying to get my head around whether or not we need a major warning of any form (and/or be permissive of the plural for this version?!)

@will-moore
Copy link
Member Author

Not sure what you mean by:

but the spec never said that it was correct

Translating the spec into JSON schema (or vice versa) was a manual process, hence error-prone.

It's invalid for the "unit" to be a number (instead of a string) but it's not invalid to add a custom "units" value, since JSON schema allows custom keys. I think it's possible to set the schema to dis-allow any custom / unrecognised keys, but I think we've chosen not to do that because we want to allow extensibility or at least temp addition of custom metadata.
Even with the fix to "unit", we are still permissive of "units"!

I'm not sure we need a major warning. The most significant "bug" would be that if you used the "units" schema, then it wouldn't have flagged "unit": 100 as invalid. But that seems kinda edge-case scenario.

@joshmoore
Copy link
Member

Not sure what you mean by:

i.e. the typo was never in the spec, but as you say, only in the translation to the validator and omero-cli-zarr.

Even with the fix to "unit", we are still permissive of "units"!

With permissive I meant more that we would also recognize the plural as valid. (Would be quite the kludge)

The most significant "bug" would be that if you used the "units" schema, then it wouldn't have flagged "unit": 100 as invalid. But that seems kinda edge-case scenario.

For this repo, agreed. But do we have a feeling how many filesets are out there? Can the validator warn about the particular use case of "units"? i.e. (softly) disallow it to help people out?

@sbesson
Copy link
Member

sbesson commented Nov 3, 2022

My understanding is that the generated datasets are syntactically valid since unit is not mandatory but a recommended property of the axes specification and custom keys like units are not forbidden by the spec

Despite their validity, the datasets we generated did not represent what was intending as the unit metadata was effectively missing. Additionally, the published schema were incorrectly validating the units key (if it existed) and ignoring the unit key (if it existed).

I feel that supporting both forms is a lot of work and likely overkill esp. without having any idea of the impact (so far only omero-cli-zarr to the best of our knowledge would generate dataset using the plural form). Including some tooling/extending the validator to warn of the presence of units might be a good idea.

@will-moore
Copy link
Member Author

I've manually fixed "units" -> "unit" in the 'done' samples above, with e.g.:

mc cp uk1s3/idr/zarr/v0.4/idr0050A/4995115.zarr/.zattrs 4995115.zattrs
vi 4995115.zattrs 
mc cp 4995115.zattrs uk1s3/idr/zarr/v0.4/idr0050A/4995115.zarr/.zattrs

but this isn't going to scale well for those 2 plates.
Any tips on how to script/automate this?

@fm3
Copy link

fm3 commented Nov 15, 2022

Thanks for looking into this!

FYI, we just ran into units again in https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr/labels/0/.zattrs – the dataset has a checkmark in the above list, but apparently the labels .zattrs file was not yet adapted?

@will-moore
Copy link
Member Author

@fm3 Thanks for picking that up. Fixed units -> unit in https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr/labels/0/ just now.

@will-moore
Copy link
Member Author

To fix units -> unit in plate...

Download just the .zattrs locally...

aws s3 sync --no-sign-request  --endpoint-url https://uk1s3.embassy.ebi.ac.uk --exclude '*' --include "*.zattrs" s3://idr/zarr/v0.4/idr0001A/2551.zarr .

Replaced all "units" with "unit" (1724 occurrences across 862 files)

Tried to re-upload with aws but no joy (dont' have mc installed locally):

$ aws s3 cp .zattrs s3://idr/zarr/v0.4/idr0001A/2551.zarr/.zattrs
upload failed: ./.zattrs to s3://idr/zarr/v0.4/idr0001A/2551.zarr/.zattrs An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.

Instead, copied to where I do have mc...

$ rsync -rvP --progress ./ ome-zarr-dev1.openmicroscopy.org:/lifesci/groups/jrs/wmoore/idr0001/2551.zarr

And then from there uploaded as before...

$ mc cp -r 2551.zarr/ uk1s3/idr/zarr/v0.4/idr0001A/2551.zarr/ 

@will-moore
Copy link
Member Author

Repeated with plate: idr0072B/9512.zarr: 2880 occurrences replaced in 1440 files.

rsync -rvP --progress ./ ome-zarr-dev1.openmicroscopy.org:/lifesci/groups/jrs/wmoore/idr0072/9512.zarr

etc.

Tested that all wells and images in plate are valid.
Checked for "unit" in images. Eg. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0072B/9512.zarr/I/13/13/

@will-moore
Copy link
Member Author

@joshmoore All samples above are fixed "units" -> "unit" now.
Good to merge and get this fix in?

@will-moore
Copy link
Member Author

Any outstanding fixes required here?
Would be good to get this in.

@will-moore will-moore mentioned this pull request Mar 2, 2023
@will-moore
Copy link
Member Author

@joshmoore - Good to get this fix in? - since this fixes an existing bug with the live schemas.
Then we can consider #168 as a potential follow-up?

@joshmoore joshmoore merged commit 5600401 into ome:main Mar 9, 2023
github-actions bot added a commit that referenced this pull request Mar 9, 2023
units unit fix

SHA: 5600401
Reason: push, by @joshmoore

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to d-v-b/ngff that referenced this pull request Mar 10, 2023
…_fix

units unit fix

SHA: 5600401
Reason: push, by @d-v-b

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

units instead of unit
4 participants