Object and inventory versioning #475

Closed
pwinckles opened this issue May 21, 2020 · 6 comments · Fixed by #479

Comments

@pwinckles

pwinckles commented May 21, 2020

The discussion in #474 raised a number of additional versioning related questions.

  1. Is there a distinction between the object conformance declaration and the inventory type definition? That is, does one represent the version of the object and the other the version of the inventory, or do they both represent the version of the object as a whole?
  2. MUST the OCFL version in the object conformance declaration match the version in the root inventory type definition?
  3. Can an object created in one version of the spec later be updated to a different version?
  4. If the OCFL versions in an inventory and in the object conformance declaration can change, independently or otherwise, then the versions in historic inventories may no longer match the object conformance declaration. Is this a problem? Should there be a NAMASTE file in each version directory?
@ahankinson
Contributor

  1. The object conformance declaration is meant to be the authoritative source of the version, but the presence of the type value is meant to signal the validation conformance of the inventory itself, and to provide additional redundancy if the inventory were to sit outside of the object (e.g., if an org wanted to just download the inventory from an external storage service).
  2. Yes
  3. No; not least because we wouldn't necessarily be able to guarantee that the versioning mechanism will stay consistent across OCFL versions, and because the introduction of features like distributed storage in later versions might make backwards compatibility impossible.
  4. I think the recommended path forward for new versions of OCFL Objects would be to either separate the storage roots and have clients understand the object versions in them (since the Storage Root declaration is also OCFL Versioned) if you wanted to be conservative about your data, or do a migration of your data to the new versions. I think trying to make a single OCFL object both forward and backwards compatible is not really something we should commit to, simply because we just don't have enough experience to make that claim and know whether it is actually possible.

I think it is also important to recognize that we are defining a data packaging format, and not a piece of software. There is value in keeping data structures stable, since we need predictable behaviours from them. These are not the same values that we have in software, where the emphasis is more on constant improvement of the system. If a software package goes years without updates, it's generally considered stale. If data goes years without updates, it's stable. The 'b' makes a big difference. :-)

I fully understand that there is a tension between the people who have to write constantly improving software, whose lifespan is generally under a decade, and a stable data format whose lifespan should be measured in multiple decades. But I also think it's shortsighted to design these data formats with the idea that they should change at the same rate as we think software will change. If we do, we don't actually solve the problem that we're trying to solve, which is to reduce long-term risk to data by decoupling the storage format from the application(s) that manage it.

@zimeon
Contributor

zimeon commented May 21, 2020

I think this issue should be deferred. Currently we are working toward v1.0. Both the conformance declaration file in the object root and the type in each version must specify 1.0.

@pwinckles
Author

At this point, I don't have a personal opinion about how I would like OCFL to behave here. I raised these questions because, as it currently stands, the intent of the spec is ambiguous, and, based on the discussion in the other ticket, there is not a shared editorial understanding of what the intent is either.

I don't necessarily have a problem deferring these decisions until there actually are multiple OCFL spec versions in existence. However, I will say that to me it seems like there is a conflict between what @ahankinson was saying about stable data formats and taking an iterative approach to the OCFL specification. I'm not saying that either is wrong, but it seems to me like there are plans for future OCFL spec versions within the next, say, 2-5 years (maybe?). And, if this is the case, it does not seem like a stable format, and, if I had migrated to OCFL 1.0 only to later find out a couple years down the road that I would need to rewrite all of my objects if I wanted them to make use of OCFL 2.0 features, then I might be a little miffed.

@ahankinson
Contributor

But the 1.0 spec won't change, so any content that is produced as 1.0 will stay as 1.0. Unless there is something you need in 2.0, there is no real reason to upgrade your data to the latest version of the spec.

@pwinckles
Author

Fair enough (and I mostly agree with your positions both in this ticket and the other)! For the sake of clarity of OCFL implementations, I still think it would be helpful if the spec picked a side in 1.0, even though there is only one version.

For example, the root conformance section already states:

OCFL Objects within the OCFL Storage Root also include a conformance declaration which MUST indicate OCFL Object conformance to the same or earlier version of the specification.

I can write code for that: is object conformance <= root conformance?
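As a rough illustration of that check, here is a minimal Python sketch. It assumes the declarations carry the strings `ocfl_1.0` (storage root) and `ocfl_object_1.0` (object), and that versions compare as (major, minor) pairs; the function names are hypothetical and not taken from any OCFL library.

```python
import re
from typing import Tuple


def parse_version(declaration: str, prefix: str) -> Tuple[int, int]:
    """Extract (major, minor) from a declaration such as 'ocfl_object_1.0'."""
    match = re.fullmatch(re.escape(prefix) + r"(\d+)\.(\d+)", declaration.strip())
    if match is None:
        raise ValueError(f"unrecognised declaration: {declaration!r}")
    return int(match.group(1)), int(match.group(2))


def object_conforms_to_root(object_declaration: str, root_declaration: str) -> bool:
    """True if the object declares the same or an earlier version than the root."""
    object_version = parse_version(object_declaration, "ocfl_object_")
    root_version = parse_version(root_declaration, "ocfl_")
    # Tuples compare element by element, so (1, 9) < (1, 10) as intended.
    return object_version <= root_version
```

Comparing integer tuples rather than raw strings avoids ordering surprises if a minor version ever reaches two digits.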

On the other hand, the spec is silent about the relationship between the object conformance and inventory type, both in the root and in versions. The only code that I could reasonably write is to test that they equal 1.0. In this case, that's accurate, but it's also arbitrary. I would rather the spec stated whether inventory files must reference the same spec version as their object conformance. It's fine if the requirement changes in some later version. At least the intent is clear from one spec version to the next.
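For comparison, the "test that they equal 1.0" fallback might look like the sketch below. It assumes inventories are files named `inventory.json` in the object root and in version directories whose names start with `v`; the function name is hypothetical, and the hard-coded URI is exactly the arbitrary part described above.

```python
import json
from pathlib import Path

# Assumption: the only acceptable value while 1.0 is the sole spec version.
EXPECTED_TYPE = "https://ocfl.io/1.0/spec/#inventory"


def inventories_with_unexpected_type(object_root: Path) -> list:
    """Return paths of inventories whose 'type' is not the 1.0 inventory URI."""
    candidates = [object_root / "inventory.json"]
    candidates += sorted(object_root.glob("v*/inventory.json"))
    mismatched = []
    for path in candidates:
        if not path.is_file():
            continue  # a missing inventory is a separate validation concern
        inventory = json.loads(path.read_text(encoding="utf-8"))
        if inventory.get("type") != EXPECTED_TYPE:
            mismatched.append(path)
    return mismatched
```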

@zimeon zimeon added this to the 1.0 milestone Jun 11, 2020
@zimeon
Contributor

zimeon commented Jun 11, 2020

Based on discussion with @pwinckles on the community call I think the key point here that we should consider and decide before releasing v1.0 is whether we want to be explicit that the object conformance declaration MUST match the version of the type in the Object Root inventory (not necessarily for prior versions).

In https://ocfl.io/draft/spec/#version-inventory perhaps change:

type
A type for the inventory JSON object that also serves to document the OCFL specification version that the inventory complies with. This must be the URI of the inventory section of the specification, https://ocfl.io/1.0/spec/#inventory.

to

type
A type for the inventory JSON object that also serves to document the OCFL specification version that the inventory complies with. In the object root inventory this MUST be the URI of the inventory section of the specification version matching the object conformance declaration. For the current specification version the value is https://ocfl.io/1.0/spec/#inventory.
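If the proposed wording were adopted, a validator could derive the expected version from the declaration instead of hard-coding it. The following Python sketch assumes the declaration string has the form `ocfl_object_<version>` and the type URI the form `https://ocfl.io/<version>/spec/#inventory`; the function and constant names are hypothetical.

```python
import re

# Assumed patterns, generalised from the 1.0 values quoted above.
INVENTORY_TYPE = re.compile(r"https://ocfl\.io/(\d+\.\d+)/spec/#inventory")
OBJECT_DECLARATION = re.compile(r"ocfl_object_(\d+\.\d+)")


def root_inventory_matches_declaration(inventory: dict, declaration: str) -> bool:
    """True if the root inventory 'type' names the same spec version as the declaration."""
    type_value = inventory.get("type")
    if not isinstance(type_value, str):
        return False
    type_match = INVENTORY_TYPE.fullmatch(type_value)
    declaration_match = OBJECT_DECLARATION.fullmatch(declaration.strip())
    if type_match is None or declaration_match is None:
        return False
    return type_match.group(1) == declaration_match.group(1)
```

This applies only to the object root inventory; inventories in prior version directories are deliberately left out, in line with the "not necessarily for prior versions" caveat.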
