Defining a repository from peer storage roots #43
Comments
IMO this is distinct from OCFL/spec#22. I think the idea of having a way to describe that a storage root contains partial content for a "repository" that is spread across multiple storage roots, or that one or more replica copies of a storage root exist, is interesting. I think there are perhaps different requirements for these two use cases. For example, the notions of
Feedback on Use Cases
In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.
Polling on Use Cases
In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as
The poll will remain open through the end of February 2024.
I understand needing data to be distributed across many storage locations/options, but I'm not sure I totally understand why OCFL needs to be aware of this. It would be helpful to hear more about what is gained by having all the storage roots in one OCFL repository, versus having an application layer above OCFL be aware of multiple repositories. Would the OCFL specification be moving towards handling additional functions like replication, tiering and load balancing, or is it primarily for ease of discovery by a client without needing to keep track of multiple repositories?
I agree with @bbpennel -- this feels to me like functionality that doesn't need to be part of the core spec. Perhaps there is a reason this can't be implemented as an extension, but I don't see it.
Wow - this is a blast from the past! I've long since moved on from that project but we decided quite a while ago to stop using OCFL altogether. The complexity of the spec and the compromises we were required to accept just didn't stack up. I don't know if the project will reconsider OCFL in the future but I do know it won't be using the architecture described in this ticket (which we weren't happy about in any case) so I think this can be canned.
I agree with other comments that this should not be part of the core OCFL specification. I think we would need to see experiments combining individually valid OCFL Storage Roots to explore what would be needed at the core level and could not be implemented through a separate higher-level specification.
2024-02-29 Editors agree that we will close as out of scope. Comments do not support inclusion in the spec and the original institutional use case no longer applies. Voting at time of closing is -2.
[Moved from the spec issues repository as this describes a new use case of handling multiple storage roots making up one repository. It includes both the aggregation of content in multiple storage roots and possibly replication of content.]
This may be a part of issue OCFL/spec#22 and it certainly follows on from the comment.
My institution can't provide a single 200TB volume (!). But they can give me 2 x 70TB and a 60TB volume. So for my use case I now need to have 3 OCFL filesystems that I interact with as a single unit from my service.
Given this, it would be nice to be able to define metadata at the repository level that says this filesystem is a part of a larger set of peers. Nice to haves would include defining a priority for each peer and perhaps the storage tier. That way, clients can make smart decisions about ranking peers by tier and then priority (I imagine these are properties defined by the administrators provisioning the storage).
The justification for this is that any connecting service or user inspecting the filesystem can identify that it is part of a larger set.
For example - a storage.json or some such with content like:
In this model priority can be any sequential number and class could be 'hot', 'warm', 'cold' to dovetail with typical nomenclature used in the industry.
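A minimal sketch of how a client might rank peers by tier and then priority, assuming a hypothetical file layout. Only the `priority` and `class` fields are named in this issue; the `peers` list, the `path` key, the mount paths, and the rule that a lower priority number is preferred are assumptions made for illustration, not part of any OCFL specification.

```python
import json

# Hypothetical storage.json content. Only "priority" and "class" come from
# the issue text; the "peers" list, "path" key, and paths are assumptions.
STORAGE_JSON = """
{
  "peers": [
    {"path": "/mnt/ocfl-a", "priority": 2, "class": "hot"},
    {"path": "/mnt/ocfl-b", "priority": 1, "class": "hot"},
    {"path": "/mnt/ocfl-c", "priority": 1, "class": "cold"}
  ]
}
"""

# Tier ordering: hotter tiers are tried first.
CLASS_ORDER = {"hot": 0, "warm": 1, "cold": 2}


def rank_peers(document):
    """Return peers sorted by storage tier, then by priority.

    Assumes a lower priority number means "preferred".
    """
    peers = json.loads(document)["peers"]
    return sorted(peers, key=lambda p: (CLASS_ORDER[p["class"]], p["priority"]))


ranked = [peer["path"] for peer in rank_peers(STORAGE_JSON)]
print(ranked)
```

Sorting on a `(tier, priority)` tuple keeps the ranking rule in one place, so an administrator could change tier names or priorities in the file without touching client code.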