sidecar file naming and format #520
Comments
With an interest in retaining compliance with the 1.0 specification, one approach could be to create an object or storage root extension that defines the digestAlgorithm used. This would facilitate direct access to any given inventory digest file.
It would certainly seem reasonable to have a storage root level statement (i.e. an extension) that says "every object will use sha512 digests/sidecars" or at least "the latest version of every object will use sha512 digests/sidecars". This would essentially turn any occurrence of something else into a local error.
Having a storage root or object extension that defines the digestAlgorithm sounds fine. (Actually, I might lean toward a storage root extension. I'm not sure it makes much sense to have an object extension that you load to find out the algorithm when you can find it in other places in the object.)
I am marking this as a 2.0 issue, with the 1.0 recommendation of defining an extension that details which algorithm is used in order to directly know the name of the inventory file.
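To make the recommendation concrete, here is a minimal sketch of how a client could build the sidecar name directly once the algorithm is declared at the storage root. No such extension is published in this thread, so the `digestAlgorithm` config field and the `sidecar_key` helper are assumptions for illustration only.

```python
def sidecar_key(object_root: str, extension_config: dict) -> str:
    """Build the sidecar path/key without listing files or parsing the inventory."""
    # "digestAlgorithm" as a config field is an assumption, not a published extension.
    algorithm = extension_config["digestAlgorithm"]
    return f"{object_root.rstrip('/')}/inventory.json.{algorithm}"


# Hypothetical storage-root extension config, e.g. loaded once at startup.
config = {"digestAlgorithm": "sha512"}
print(sidecar_key("objects/abc123/", config))
```

With this in place, an object store client can issue a single GET for the sidecar instead of a list request.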
@pwinckles is this something that you still need addressed, or are you happy with the way things are?
@rosy1280 you can close |
The specification for the inventory sidecar is awkward to use because you often want to verify the integrity of an inventory file before deserializing it, but this is complicated by the fact that the name of the sidecar file is dependent on the digest algorithm that's defined within the inventory file.
On filesystem implementations, this is annoying but not a big deal. You can just list the files and examine their names to identify the sidecar. However, the problem is more annoying for object store implementations to resolve.
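As a rough illustration of the filesystem workaround, the sketch below lists an object root and picks out the file named `inventory.json.<algorithm>`. The `find_sidecar` helper is hypothetical, and it assumes exactly one sidecar sits next to the inventory.

```python
from pathlib import Path


def find_sidecar(object_root):
    """Return (sidecar_path, algorithm) by scanning for inventory.json.<algorithm>."""
    prefix = "inventory.json."
    matches = [
        p for p in Path(object_root).iterdir()
        if p.is_file() and p.name.startswith(prefix) and len(p.name) > len(prefix)
    ]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one sidecar, found {len(matches)}")
    sidecar = matches[0]
    # The algorithm is whatever follows the "inventory.json." prefix.
    return sidecar, sidecar.name[len(prefix):]
```

On an object store, the equivalent is a prefix listing, which costs an extra request per object just to learn a file name.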
It seems to me that the sidecar file specification was based on the BagIt manifest specification, but it does not seem like a good fit.
The ship may have sailed on this one, but, to me, it makes more sense if the sidecar MUST be named `inventory.json.sidecar` (or whatever better name you can come up with), and have contents like `ALGORITHM\tDIGEST`, where `ALGORITHM` MUST be the same as the algorithm that's defined in the inventory. This would allow the sidecar to be easily located without needing to deserialize the inventory or root around looking for it.
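A minimal sketch of how a client might consume the proposed format, assuming a fixed `inventory.json.sidecar` file containing `ALGORITHM\tDIGEST` and algorithm names that `hashlib.new` accepts (such as `sha512`). This is only the proposal above, not anything in the 1.0 specification, and `load_verified_inventory` is a hypothetical helper.

```python
import hashlib
import json
from pathlib import Path


def load_verified_inventory(object_root):
    """Verify inventory.json against the proposed sidecar, then deserialize it."""
    root = Path(object_root)
    # Fixed name: no directory listing and no inventory parsing needed to find it.
    sidecar_text = (root / "inventory.json.sidecar").read_text(encoding="utf-8")
    algorithm, expected = sidecar_text.strip().split("\t")

    inventory_bytes = (root / "inventory.json").read_bytes()
    actual = hashlib.new(algorithm, inventory_bytes).hexdigest()
    if actual != expected.lower():
        raise ValueError("inventory digest does not match sidecar")

    # Only deserialize once the bytes are known to be intact.
    inventory = json.loads(inventory_bytes)
    if inventory.get("digestAlgorithm") != algorithm:
        raise ValueError("sidecar algorithm differs from inventory digestAlgorithm")
    return inventory
```

The integrity check happens before `json.loads`, which is the ordering the current naming scheme makes awkward.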
[Edit] Reflecting on it more, I see that the format does align with how checksums are usually stored on unix systems. It's easy for a person to use manually. It's just more complicated to use programmatically.
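For comparison, here is a sketch of checking the existing checksum-style layout programmatically. It assumes the sidecar holds a single line of the form `DIGEST  inventory.json` and that the caller has already recovered the algorithm from the file name. `check_sidecar_line` is a hypothetical helper.

```python
import hashlib


def check_sidecar_line(sidecar_text: str, inventory_bytes: bytes, algorithm: str) -> bool:
    """Return True if a 'DIGEST  inventory.json' line matches the inventory bytes."""
    parts = sidecar_text.split()
    if len(parts) != 2 or parts[1] != "inventory.json":
        raise ValueError("unexpected sidecar contents")
    expected = parts[0]
    # The algorithm is not in the file contents; it has to come from the file name.
    actual = hashlib.new(algorithm, inventory_bytes).hexdigest()
    return actual == expected.lower()
```

The check itself is simple. The extra work is in learning `algorithm` before the sidecar can even be opened.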