How to cite extensions #387

Closed
pwinckles opened this issue Oct 3, 2019 · 13 comments · Fixed by #388

Comments

@pwinckles

Yesterday, language was added to the spec that introduces "cite-able" extensions. ocfl_layout.json can be used to cite extensions related to storage layout. Where are other types of extensions intended to be cited? For example, where would I cite the use of sha-512/256? Are all extensions intended to be cited in ocfl_layout.json?

@ahankinson
Contributor

sha-512 does not need to be cited as part of an extension, since it is part of the spec. If you wanted to use, for example, blake2s-64 for fixity, however, you would need to be able to refer to it somehow, which is what cite-ability means.

@pwinckles
Author

sha-512/256 (truncated sha-512) isn't part of the spec.

> you would need to be able to refer to this somehow, which is what the cite-ability means.

That is exactly what I'm asking for clarity on. I don't see the point of citing something if no one knows where to look for the citation.

@ahankinson
Contributor

Sorry, I thought you meant "sha-512 OR sha-256" but were using shorthand.

Do you mean where would the extensions be published? https://ocfl.github.io/extensions/

@pwinckles
Author

Let's say that I have an OCFL repository that adheres to non-normative extensions published at https://ocfl.github.io/extensions/. Where do I cite them in the repository to let others know which extensions I'm using? For extensions related to storage layout, I can cite them in ocfl_layout.json.

@birkland
Contributor

birkland commented Oct 3, 2019

I think there is a bit of nuance here. I think what @pwinckles is getting at here is less "this is the definition of sha512/256 and the hash ID that can be used in OCFL", and more "this repository leverages the extension sha512/256". In other words, is there a general mechanism for advertising that a repository uses a certain set of extensions? ocfl_layout.json can be used for citing a particular kind of extension by URI as it pertains to layout, and can signal to clients that they need to use the layout defined in the extension.

So could there be a way to indicate "this repository uses sha512/256 as defined in this extension" and "clients should use sha512/256 instead of sha512 when they create new objects"?

@ahankinson
Contributor

I think the intention was that you should be able to use URI values if you were using a non-normative extension mechanism. So you could use sha-512 as the key in a fixity section OR https://ocfl.github.io/extensions/hash-algorithms#sha-512/256.

@pwinckles
Author

Thanks, that's what was unclear to me.

@ahankinson
Contributor

I can raise this with the editors today and see if they concur.

@pwinckles
Author

pwinckles commented Oct 3, 2019

Is the intent that the digest algorithm that's used for content-addressing MUST either be sha512 or sha256?

> The digest algorithm used for calculating digests within the OCFL Object. This SHOULD be sha512, however sha256 MAY also be used.

Can any of the other controlled values be used? How about extensions? If extensions can be used, how does this impact the inventory sidecar?

> Every occurrence of an inventory file MUST have an accompanying sidecar file stating its digest. This sidecar file must be of the form inventory.json.ALGORITHM, where ALGORITHM is the chosen digest algorithm for the object. An example might be inventory.json.sha512.
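
To make the naming rule quoted above concrete, here is a minimal sketch of how a client might derive the sidecar file name and digest for an inventory. The function name `sidecar_for` is hypothetical, and the sketch only covers the `inventory.json.ALGORITHM` naming and the digest calculation; the exact contents of the sidecar file are not described in this thread and are left out.

```python
import hashlib
from typing import Tuple


def sidecar_for(inventory_bytes: bytes, algorithm: str = "sha512") -> Tuple[str, str]:
    """Return (sidecar file name, hex digest) for the given inventory bytes.

    `sidecar_for` is a hypothetical helper, not part of any OCFL client.
    The name follows the quoted form inventory.json.ALGORITHM.
    """
    # hashlib.new accepts an algorithm name plus initial data.
    digest = hashlib.new(algorithm, inventory_bytes).hexdigest()
    return f"inventory.json.{algorithm}", digest


name, digest = sidecar_for(b'{"digestAlgorithm": "sha512"}')
print(name)
```

If an extension algorithm were allowed here, the `ALGORITHM` suffix would have to carry its identifier, which is why the question about extensions affects the sidecar.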

@ahankinson
Contributor

> Is the intent that the digest algorithm that's used for content-addressing MUST either be sha512 or sha256?

Yes. No other hash algorithms can be used. This is for digestAlgorithm. For fixity, however, the table given and the extensions provide for virtually limitless hash algorithm choices.

This decision was made because OCFL relies on the digestAlgorithm to function, while fixity is an optional extra. Clients will then only need to implement two hash algorithms in order to be compliant.
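
As a rough illustration of that split, a validating client could reject any `digestAlgorithm` other than the two permitted values while leaving the keys of the fixity section unrestricted. This is a sketch under assumptions: the function name and the error wording are hypothetical, and the inventory is modelled as a plain dictionary.

```python
from typing import Dict, List

# The only values this comment allows for digestAlgorithm.
ALLOWED_DIGEST_ALGORITHMS = {"sha512", "sha256"}


def check_digest_algorithm(inventory: Dict) -> List[str]:
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper: only digestAlgorithm is restricted. Keys in the
    optional fixity section are deliberately not checked here.
    """
    problems = []
    algorithm = inventory.get("digestAlgorithm")
    if algorithm not in ALLOWED_DIGEST_ALGORITHMS:
        problems.append(
            f"digestAlgorithm must be sha512 or sha256, got {algorithm!r}"
        )
    return problems


# A fixity key outside the two permitted values does not cause a problem.
print(check_digest_algorithm({"digestAlgorithm": "sha512", "fixity": {"md5": {}}}))
# Using that same algorithm for content addressing does.
print(check_digest_algorithm({"digestAlgorithm": "md5"}))
```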

@pwinckles
Author

pwinckles commented Oct 3, 2019

Thanks for the clarification. I think it would be helpful if that MUST were also stated in the digest section rather than solely buried in the digestAlgorithm field description.

@ahankinson
Contributor

It also removes the possibility that known-broken hash algorithms (CRC, MD5, SHA1) are used for content addressability. Should a collision occur in an OCFL Object, it would break content addressability, which would leave the object in an unknown state. (Two files, different content, same hash = lost content).
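
The failure mode can be shown with a toy content-addressed table. This is only an illustration, not how any OCFL client stores content: `weak_digest` is a deliberately crippled, hypothetical hash with 16 possible values, so 17 distinct inputs are guaranteed to collide at least once.

```python
import hashlib
from typing import Callable, Dict, Iterable


def weak_digest(content: bytes) -> str:
    # Deliberately weak stand-in for a broken hash: keep one hex character,
    # so there are only 16 possible digests.
    return hashlib.sha512(content).hexdigest()[:1]


def strong_digest(content: bytes) -> str:
    return hashlib.sha512(content).hexdigest()


def store(contents: Iterable[bytes], digest_fn: Callable[[bytes], str]) -> Dict[str, bytes]:
    """Toy content-addressed table keyed by digest."""
    table: Dict[str, bytes] = {}
    for content in contents:
        # On a collision the later content silently replaces the earlier one.
        table[digest_fn(content)] = content
    return table


contents = [f"file-{i}".encode() for i in range(17)]
weak_table = store(contents, weak_digest)
strong_table = store(contents, strong_digest)

# Fewer entries than inputs means some content was lost.
print(len(contents), len(weak_table), len(strong_table))
```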

@ahankinson
Contributor

Discussion: Keys will be used instead of URIs. @zimeon will create a table of extensions listing the keys that can be used in the fixity section in addition to those already available in the spec.
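
One way a client might consume such a table is a simple lookup from fixity key to hash constructor. This is a sketch under stated assumptions: the table did not exist at the time of this comment, so the key `blake2s-64` is borrowed from the example earlier in the thread, its reading as a 64-bit (8-byte) BLAKE2s digest is an assumption, and the names `SPEC_KEYS`, `EXTENSION_KEYS`, and `fixity_digest` are hypothetical.

```python
import hashlib
from typing import Callable, Dict

# Keys assumed to come from the spec itself (subset, for illustration).
SPEC_KEYS: Dict[str, Callable[[bytes], "hashlib._Hash"]] = {
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
}

# Hypothetical extension table: key name and digest size are assumptions.
EXTENSION_KEYS: Dict[str, Callable[[bytes], "hashlib._Hash"]] = {
    "blake2s-64": lambda data: hashlib.blake2s(data, digest_size=8),
}


def fixity_digest(key: str, data: bytes) -> str:
    """Compute a fixity digest for a key from the spec or the extension table."""
    constructor = SPEC_KEYS.get(key) or EXTENSION_KEYS.get(key)
    if constructor is None:
        raise ValueError(f"unknown fixity key: {key!r}")
    return constructor(data).hexdigest()


print(fixity_digest("blake2s-64", b"example content"))
```

Because the lookup is by plain key rather than by URI, a client that does not know a key can only report it as unknown; it has nothing to dereference.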

ahankinson added a commit that referenced this issue Oct 3, 2019
Adds further clarification to the text around what values for digest algorithms can be, and tightens up the language around the choice of digest algorithms.

Fixes #387
rosy1280 pushed a commit that referenced this issue Oct 3, 2019
* Formatting

* Fixed: Clarify specification of algorithms

Adds further clarification to the text around what values for digest algorithms can be, and tightens up the language around the choice of digest algorithms.

Fixes #387

* Fixed: Remove reference to SHA-256 paper

The reference to SHA-256 being more computationally intensive is not true on certain platforms or with certain chipsets, so it has been removed from the spec.