Collapsing OCFL Object Versions #46

Open
awoods opened this issue Oct 12, 2023 · 9 comments
Labels
Component: Specification Confirmed: In-scope Use case will be included in the upcoming version of the spec or implementation notes.

Comments

@awoods
Member

awoods commented Oct 12, 2023

When versions of an OCFL object have been created that are not considered curatorially significant, it would sometimes be useful for OCFL to support collapsing those versions. For example, in the object below, versions 3-6 might be considered one curatorially significant version of the object:

[object root]
    ├── 0=ocfl_object_1.1
    ├── inventory.json
    ├── inventory.json.sha512
    ├── v1
    │   ├── inventory.json
    │   └── ...
    ├── v2
    │   ├── inventory.json
    │   └── ...
    ├── v3
    │   ├── inventory.json
    │   └── ...
    ├── v4
    │   ├── inventory.json
    │   └── ...
    ├── v5
    │   ├── inventory.json
    │   └── ...
    ├── v6
    │   ├── inventory.json
    │   └── ...
    ├── v7
    │   ├── inventory.json
    │   └── ...
    └── v8
        ├── inventory.json
        └── ...

It would be helpful to be able to collapse those versions, such as:

[object root]
    ├── 0=ocfl_object_1.1
    ├── inventory.json
    ├── inventory.json.sha512
    ├── v1
    │   ├── inventory.json
    │   └── ...
    ├── v2
    │   ├── inventory.json
    │   └── ...
    ├── v3 (contains collapsed result of previous versions 3-6)
    │   ├── inventory.json
    │   └── ...
    ├── v4 (previous v7)
    │   ├── inventory.json
    │   └── ...
    └── v5 (previous v8)
        ├── inventory.json
        └── ...
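
The renumbering in the example above can be expressed as a mapping from old version numbers to new ones. The following is a minimal Python sketch; the function name `collapse_mapping` and its parameters are hypothetical and not part of OCFL or any OCFL library.

```python
def collapse_mapping(version_count, first, last):
    """Map each old version number to its new number after merging first..last."""
    if not 1 <= first <= last <= version_count:
        raise ValueError("collapse range must lie within the existing versions")
    removed = last - first  # number of version slots that disappear
    mapping = {}
    for old in range(1, version_count + 1):
        if old < first:
            mapping[old] = old            # earlier versions are untouched
        elif old <= last:
            mapping[old] = first          # collapsed versions share one number
        else:
            mapping[old] = old - removed  # later versions shift down
    return mapping


# Hypothetical usage matching the example: eight versions, collapse 3-6.
mapping = collapse_mapping(8, 3, 6)
for old, new in mapping.items():
    print(f"v{old} -> v{new}")
```

Note that every version after the collapsed range changes its name, which is the behavior some commenters below raise concerns about.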
@rosy1280 rosy1280 added Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. Component: Specification and removed Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. labels Oct 26, 2023
@rosy1280
Contributor

Feedback on Use Cases

In advance of version 2 of the OCFL, we are soliciting feedback on use cases. Please feel free to add your thoughts on this use case via the comments.

Polling on Use Cases

In addition to reviewing comments, we are doing an informal poll for each use case that has been tagged as Proposed: In Scope for version 2. You can contribute to the poll for this use case by reacting to this comment. The following reactions are supported:

👍🏼 In favor of the use case
👎🏼 Against the use case
👀 Neutral on the use case

The poll will remain open through the end of February 2024.

@srerickson

srerickson commented Oct 30, 2023

This use case describes a type of transformation to an OCFL object. My understanding is that the OCFL spec doesn't really concern itself with transformations, but with "objects at rest". I don't see why collapsing versions can't be implemented if needed under v1 (e.g., through a re-ingest process, similar to what's required for purging content) or how v2 might be designed to make this easier.

@awoods
Member Author

awoods commented Nov 6, 2023

Agreed. This use case may inform implementation notes and (ideally) implementation in OCFL libraries.

@slabrams

slabrams commented Nov 6, 2023

I strongly endorse this use case. The design of our repository system predates OCFL. As a result, routine object editing operations can result in a large number of granular but curatorially meaningless versions. This needlessly complicates the object's internal structure, increases inventory sizes, and - well - is just an inelegant outcome. Even if the solution is just a formal statement of implementation best practice for collapsing versions, that is important for getting such practice integrated into well-known OCFL tools and library packages. It is useful if we can all try to converge on doing the same things the same way.

@pwinckles

Prior relevant discussion: OCFL/spec#367

The implementation notes currently discourage this:

"Previous versions of an object should be considered immutable since the composition of later versions of an object may be dependent on them. In addition, the assumption of immutability ensures that copies of different versions of an object remain consistent with each other, avoiding issues with identifying canonicity and reconciliation."

https://ocfl.io/1.1/implementation-notes/#version-immutability

@scossu

scossu commented Feb 9, 2024

I support this use case too, but I am concerned about the renaming of the versions that follow the collapsed ones. Is that necessary in order to keep a mandatory gap-less sequence?

In that case, this may reveal a weakness in OCFL's reliance on version name semantics to build the object's history. An explicit chain of links, in the style of a linked list, would allow for more flexibility (e.g. one may want to name a version by UUID or checksum) and, in this case, wouldn't require changing the location of versions not affected by the change.

Has this been discussed elsewhere?
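
To illustrate the linked-list idea, here is a rough Python sketch. The `previous` field, the helper names, and the version names are invented for illustration; OCFL 1.x has no such field and orders versions by their `vN` directory names.

```python
def history(versions, head):
    """Walk the hypothetical `previous` links from head back to the first version."""
    chain, seen = [], set()
    current = head
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        current = versions[current]["previous"]
    return list(reversed(chain))  # oldest first


def drop_version(versions, name):
    """Remove one version and relink its successor to its predecessor."""
    removed = versions.pop(name)
    for record in versions.values():
        if record["previous"] == name:
            record["previous"] = removed["previous"]


# Version names are arbitrary labels here (they could be UUIDs or checksums).
versions = {
    "a1": {"previous": None},
    "b2": {"previous": "a1"},
    "c3": {"previous": "b2"},
}
before = history(versions, "c3")
drop_version(versions, "b2")
after = history(versions, "c3")
print(before, after)
```

In this sketch the remaining versions keep their names and locations after a removal; only one link changes.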

@scossu

scossu commented Feb 9, 2024

To @pwinckles's point: if, instead of entirely deleting the version folder, a tombstone is left with a reference to the closest valid version, a consumer could be both notified of the deletion and redirected to a useful resource (even if it is not exactly the same one; as we know, in the real world we do delete things, and OCFL should acknowledge that).

@rosy1280
Contributor

At the time of this comment the vote tallied to +1. We are marking this as in-scope for version 2 because we suspect it will be related to #42, which is already in scope for version 2 of the specification.

@rosy1280 rosy1280 added Confirmed: In-scope Use case will be included in the upcoming version of the spec or implementation notes. and removed Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. labels Feb 29, 2024
@zimeon zimeon added this to the Supported in v2.0 milestone Feb 29, 2024
@zimeon
Contributor

zimeon commented Sep 20, 2024

2024-09-20 Editors’ discussion: Two distinct scenarios have different solutions:

  1. The desire to collapse versions and delete intermediate file revisions where there is no need or desire to keep the details of the historical changes because they are curatorially insignificant. If versions have been created rather than using an approach such as mutable head (see https://ocfl.github.io/extensions/0005-mutable-head.html), then such changes unavoidably mutate the object. We think this is best handled by rewriting the object with selected versions removed, taking care to keep all necessary/interesting changes. This approach needs no new specification support, only additional implementation notes.
  2. The desire to delete intermediate files (perhaps to save storage) but to retain the history of versions. This is handled by the Support Physical File-Deletion use case.
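
As a rough illustration of scenario 1, the Python sketch below rewrites an inventory while keeping only selected versions. It assumes the OCFL 1.1 inventory layout (`manifest`, `versions`, `state`, `head`, optional `contentDirectory`). The function name `rewrite_inventory`, the sample digests, and the choice to store each file under the first kept version that references it are assumptions made for illustration. The sketch ignores zero-padded version names and does not recompute the inventory digest sidecar.

```python
def rewrite_inventory(inventory, keep):
    """Return (new_inventory, copy_plan) retaining only the versions named in keep."""
    content_dir = inventory.get("contentDirectory", "content")
    keep = sorted(keep, key=lambda name: int(name[1:]))
    new_versions, manifest, copy_plan = {}, {}, []
    for number, old_name in enumerate(keep, start=1):
        new_name = f"v{number}"
        version = inventory["versions"][old_name]
        new_versions[new_name] = dict(version)  # keeps created, message, user, state
        for digest, logical_paths in version["state"].items():
            if digest in manifest:
                continue  # content already stored under an earlier kept version
            source = inventory["manifest"][digest][0]
            target = f"{new_name}/{content_dir}/{logical_paths[0]}"
            manifest[digest] = [target]
            copy_plan.append((source, target))
    new_inventory = dict(
        inventory, head=f"v{len(keep)}", manifest=manifest, versions=new_versions
    )
    new_inventory.pop("fixity", None)  # old content paths there would be stale
    return new_inventory, copy_plan


# Placeholder digests ("aaa", ...) stand in for real sha512 values.
inventory = {
    "id": "example-object",
    "head": "v3",
    "manifest": {
        "aaa": ["v1/content/a.txt"],
        "bbb": ["v2/content/draft.txt"],
        "ccc": ["v3/content/final.txt"],
    },
    "versions": {
        "v1": {"state": {"aaa": ["a.txt"]}},
        "v2": {"state": {"aaa": ["a.txt"], "bbb": ["draft.txt"]}},
        "v3": {"state": {"aaa": ["a.txt"], "ccc": ["final.txt"]}},
    },
}
new_inventory, copy_plan = rewrite_inventory(inventory, ["v1", "v3"])
for source, target in copy_plan:
    print(f"copy {source} -> {target}")
```

The copy plan lists which content files a rewrite would need to copy into the new object. Content referenced only by dropped versions (`bbb` here) is left out of the new manifest, so this approach mutates the object's history, as noted above.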


7 participants