Advice on ocfl_layout.json contents #351

Closed
birkland opened this issue May 17, 2019 · 9 comments

@birkland
Contributor

birkland commented May 17, 2019

I'm having some difficulty deciding what to provide for the layout uri in ocfl_layout.json. Given the description:

A URI identifying the precise arrangement of directories and OCFL objects under the storage root, i.e. how OCFL object identifiers are mapped to directory hierarchies. If a URL is used then it should resolve to a detailed specification of the arrangement.

... It appears this URI is intended to be used by software clients so that the correct path algorithm is used when creating new objects in an OCFL root. The implementation notes give some examples of such algorithms (flat, pairtree, and truncated n-tuple), but do not mention any recommended URIs to refer to these algorithms in ocfl_layout.json. Furthermore, each algorithm has some sort of choice involved: the value of "N" in truncated n-tuples, the choice of how to terminate a pairtree, or the choice of whether and how to encode IDs using the "flat" approach. Should these choices be reflected in the URI somehow?

If interoperability between clients is a goal of the URI, perhaps the implementation notes should provide a list of URIs or conventions for specifying well-known algorithms.

As an example, it might be reasonable to use https://tools.ietf.org/html/draft-kunze-pairtree-01 , or maybe https://confluence.ucop.edu/display/Curation/PairTree for pairtrees. An implementer might pick https://tools.ietf.org/html/draft-kunze-pairtree-01 as it superficially looks more durable. The pairtree spec does not advocate a specific method to encapsulate an object directory, and notes:

Practice will vary according to local custom as to how to name the
encapsulating object directory beneath that last shorty. Its name is
completely independent of the object identifier. For example, every
object directory in a pairtree could have the uniform name "thingy"

So let's say I want to terminate all directories with "obj", so an ID 13030_45xqv_793842495 would result in a directory 13/03/0_/45/xq/v_/79/38/42/49/5/obj. If the uri in ocfl_layout.json is intended to convey this nuance, then I might pick a URL like https://tools.ietf.org/html/draft-kunze-pairtree-01?terminal=obj.
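As a minimal sketch of the mapping described above, the function below splits an identifier into two-character segments and appends a terminal directory name. The function name `pairtree_path` and its `terminal` parameter are hypothetical, and the sketch assumes the ID contains only filesystem-safe characters, so it skips the character-escaping step that the pairtree draft defines.

```python
def pairtree_path(object_id: str, terminal: str = "obj") -> str:
    """Map an identifier to a pairtree-style path ending in `terminal`.

    Assumption: object_id needs no pairtree character escaping.
    """
    # Cut the ID into two-character segments; the last one may hold
    # a single character when the ID has odd length.
    segments = [object_id[i:i + 2] for i in range(0, len(object_id), 2)]
    # Append the encapsulating directory name chosen by local convention.
    return "/".join(segments + [terminal])


print(pairtree_path("13030_45xqv_793842495"))
# 13/03/0_/45/xq/v_/79/38/42/49/5/obj
```

Two clients that agree on the pairtree draft but pick different `terminal` values would produce different paths for the same ID, which is the interoperability gap in question.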

Or maybe I want a 9-character substring of the original ID, so one might use https://tools.ietf.org/html/draft-kunze-pairtree-01?substr=9.
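If a client were to read such choices back out of the layout URI, it could look roughly like the sketch below. The query parameters `terminal` and `substr` are the ad hoc ones floated in this comment, not anything defined by OCFL or the pairtree draft, and `layout_options` is a hypothetical helper name. The sketch only extracts the values; it does not decide what a 9-character substring should mean for the path.

```python
from urllib.parse import parse_qs, urlsplit


def layout_options(layout_uri: str) -> dict:
    """Split a layout URI into a base algorithm URI and its options.

    Assumption: options are carried as query parameters, which is a
    made-up convention for illustration only.
    """
    parts = urlsplit(layout_uri)
    query = parse_qs(parts.query)
    # The URI without its query string identifies the algorithm itself.
    algorithm = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # parse_qs returns a list per key; take the first value if present.
    terminal = query.get("terminal", [None])[0]
    substr = query.get("substr", [None])[0]
    return {
        "algorithm": algorithm,
        "terminal": terminal,
        "substr": int(substr) if substr is not None else None,
    }


print(layout_options("https://tools.ietf.org/html/draft-kunze-pairtree-01?terminal=obj"))
```

A client that does not recognize the base URI or one of the parameters has no safe way to guess the layout, so a scheme like this only helps when the conventions are shared.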

But at this point, I feel like I'm just making things up, and don't want to go much farther without advice.

So I suppose my specific questions are:

  • Is the uri field of ocfl_layout.json intended to be interoperable between clients?
  • Should there be a list of expected URLs to use in ocfl_layout.json for well-known path layout algorithms?
  • Is the URL intended to reflect specific choices that affect the algorithm?
@ahankinson
Contributor

#249 is probably relevant to this, if that's helpful

rosy1280 added this to the 1.0 milestone May 21, 2019
@ahankinson
Contributor

Discussed this in the Editors meeting. We'll wait for a response from @birkland to see if the linked issues answer his question.

@birkland
Contributor Author

OK, from the linked issue(s) I derive the following answers:

  • Is the uri field of ocfl_layout.json intended to be interoperable between clients?
    • There is no specific intention. If a registry or spec develops outside the OCFL effort that focuses on standardizing such values, it may be used.
  • Should there be a list of expected URLs to use in ocfl_layout.json for well-known path layout algorithms?
    • There is no plan for such a list to be part of the implementation notes at present
  • Is the URL intended to reflect specific choices that affect the algorithm?
    • Most likely. It is this complexity that leads the editors to choose not to tackle the specifics of this field

So at this point, I think I'll just ask around (@tomwrobel ?) to see if anybody has any opinions on uri values or conventions. Failing that, I'll put a stake in the ground and think up something reasonable for the purposes of the client I'm writing. I think once there is a critical mass of folks using OCFL, some de facto convention will emerge.

Please close this issue if the editors think this is a good way forward.

@tomwrobel

@birkland apologies, I'm just coming across this.

I think it's likely that, as you say, some de facto convention will emerge. However, I think it's probably worth guiding the process by example if possible. Would it be possible to a) publish the layout you're using and b) for the implementation notes to link to this layout?

It would be a shame if everyone ended up using a pairtree-from-the-identifier approach, but we all ended up with slightly different implementations of it! It's the sort of thing that's very easy to change when first writing code, and very hard to tweak once there are deployed OCFL stores.

@rosy1280
Contributor

@birkland is working on a wiki page here that will allow community members to link to their URIs. This will be closed once that wiki page exists (and is linked to here).

@tomwrobel

@birkland I see that I'm supposed to be working with you on this :) Very happy to collaborate - let me know what date/time suits to discuss

@birkland
Contributor Author

birkland commented Jul 23, 2019 via email

@birkland
Contributor Author

birkland commented Aug 6, 2019

I've been thinking more about this, and I'm beginning to think a GitHub repo for community-contributed RFCs has some nice characteristics over a wiki, particularly for comment and review (via PRs) before publication.

I created a demo GitHub repo with some example RFCs (and a PR for one) in it, along with a README with a little more explanation:
https://github.com/birkland/ocfl-rfc-demo

An example RFC on GitHub Pages:
https://birkland.github.io/ocfl-rfc-demo/0001-pairtree-layout

What do folks (@tomwrobel , editors, etc.) think about an RFC repo in the OCFL organization (OCFL/rfcs?)?

@zimeon
Contributor

zimeon commented Oct 2, 2019

Following on from @birkland's suggestion and #365, we have created https://github.com/OCFL/extensions. A PR there would be welcome!

Can we close this issue now?
