Add to implementation notes a discussion of the idea of temporary space while building OCFL Objects #320
Comments
I wonder whether there might be a case for a normative (specification) addition that says some particular directory (maybe |
Sounds like a similar description for the current |
I toyed with the idea of placing the in-progress directory somewhere under the A temp directory (or some sort of convention, like directories under content that begin with a dot |
What about |
Something along the lines of |
A temporary space on the storage root, not named /tmp, for the assembly of future OCFL object versions. |
We agreed to use the language in @zimeon's second comment, except the directory should be named |
Hmmm... as @rosy1280 just pointed out on my abortive PR #324, the #320 (comment) above suggests one Having one directory per storage root suddenly couples updates to different objects under the root in a potentially awkward way. It also doesn't provide a standard solution for within-object manipulations where they perhaps aren't following the storage root approach. (I do understand that in a filesystem implementation a whole root would likely be on one filesystem and thus move from a single |
A global |
Although not detailed in the 2019.03.20-Editors-Meeting notes, the concern in that meeting with having a |
I think my answer to that concern is that it is optional to use a workspace within the object. We have at least one example (see #320 (comment)) of a choice to do this. |
@zimeon The benefit of the deposit directory at the storage root is that it would keep the object itself (and its versions) clean until a version can be moved out of the deposit directory. In the example you cite, we have to futz with the object to remove a version if processes stall midway through creation. With the deposit directory at the storage root, you don't. As for how the deposit directory would work at a global level, you would need to recreate, inside the deposit directory, the same hierarchy you use for the regular storage root. That is, if you're creating a pair tree hierarchy, then create the pair tree hierarchy for the object that you put in the deposit directory; if you have no hierarchy, then don't create one. As an FYI, this is how Moab does deposits as well. (Note: I edited this comment because I now see other comments that weren't appearing before.)
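To make the mirrored-hierarchy idea above concrete, here is a minimal sketch in Python. It is an illustration only: the directory name `deposit`, the helper names, and the two-character split are all assumptions, and the split is a simplified pair-tree-style layout that skips the identifier escaping a real Pairtree implementation would need.

```python
from pathlib import Path

# Hypothetical name for the storage-root workspace; the thread does not fix one here.
DEPOSIT_DIR = "deposit"


def pair_path(object_id: str, width: int = 2) -> Path:
    """Split an identifier into fixed-width segments (simplified, no escaping)."""
    parts = [object_id[i:i + width] for i in range(0, len(object_id), width)]
    return Path(*parts)


def object_root(storage_root: Path, object_id: str) -> Path:
    """Where the finished object lives under the storage root."""
    return storage_root / pair_path(object_id)


def staging_root(storage_root: Path, object_id: str) -> Path:
    """Where the same object is assembled: identical hierarchy, under DEPOSIT_DIR."""
    return storage_root / DEPOSIT_DIR / pair_path(object_id)
```

Because both helpers share `pair_path`, an in-progress object always sits at the same relative location as its final home, so a stalled deposit can be cleaned up without touching the object itself.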
Personally, my feeling is that the location of temporary space allocated for assembling new versions should be left up to the client to implement. There are too many use cases and edge cases for us to properly understand this. Some implementers may be satisfied to use the |
I agree, @ahankinson . From a validation perspective, however, we should include in the specification locations that should be ignored. |
My first inclination would be to say that nothing is allowed in the Object Root, save for what we have specified. Any application-specific logic (and I would consider an 'in-flight' temporary directory application-specific) should not be stored with the content to be preserved. |
I don't think we should be prescriptive about how implementations do their manipulation of OCFL Objects alone or within an OCFL Storage Root. However, we should enable implementations to do it in the way they choose. I see three options and I advocate that all should be possible within spec:
|
@julianmorley raised the problem though, and I agree with him, that allowing incomplete or failed 'commits' within the "preservation" storage would seriously gum up the works in the long term. It's not just validation, it's also clarity of purpose -- OCFL Objects are "object at rest", not "object in motion." We've made the distinction quite clear by having Spec and Implementation Notes; I think we would be making a big step backwards if we were to start muddying it up this close to the finish line. So I would be big thumbs-down to 3, and little thumbs-down to 2. |
To be clear, my implementation currently just writes content directly into the next In any case, I think best practices will emerge from experience. I think the Fedora project might need to tweak or re-think their anticipated use cases for working with un-versioned content (where it is expected to change or otherwise be volatile before being committed to a version), but that's neither here nor there. In general, the possibility of failed or incomplete "commits" is unavoidable, because there is always some degree of motion in an object's lifetime as files fall into place (which can be mitigated to some extent by leveraging atomic renames), but it's proper for the spec, as "object at rest", to be silent about that and just describe the expected states.
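As a rough illustration of the atomic-rename mitigation mentioned above, a client could assemble a version in a scratch directory and then rename it into the object in one step. This is a sketch under stated assumptions, not any implementation from this thread: `commit_version` and its parameters are hypothetical, the work directory is assumed to be on the same filesystem as the object root (otherwise the rename is not atomic), and the inventory and digest handling that a real OCFL version needs is omitted.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def commit_version(object_root: Path, version: str,
                   files: Mapping[str, bytes], work_dir: Path) -> Path:
    """Assemble a version outside the object, then rename it into place."""
    final = object_root / version
    if final.exists():
        raise FileExistsError(f"{final} already exists")

    # Build everything in a scratch directory (assumed: same filesystem as object_root).
    work_dir.mkdir(parents=True, exist_ok=True)
    staging = Path(tempfile.mkdtemp(prefix=version + "-", dir=work_dir))
    content = staging / "content"
    content.mkdir()
    for name, data in files.items():
        target = content / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    # A single rename makes the version appear all at once, or not at all.
    object_root.mkdir(parents=True, exist_ok=True)
    os.rename(staging, final)
    return final
```

If the process dies before the rename, the leftovers stay in the work directory and the object never contains a partial version.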
It seems that the consensus is that we should not allow anything in the Object (no change to spec required) and allow a |
I also wonder if we should add something to the implementation notes discussing this topic. |
From discussion in the 2019.03.13 Community Meeting, there might be a need for a "draft" or "tmp" directory for active OCFL Objects.