From 746ed104b200d19ae8ed92e811639cfa61583997 Mon Sep 17 00:00:00 2001 From: Simeon Warner Date: Thu, 7 Nov 2024 12:22:23 -0500 Subject: [PATCH] Release v1.1.1 (#654) * Start work for v1.1.1 * Create 1.1.1 redirects, add changes log * Update change log and annoucnement * Change news link to say to update * Fix validation codes, and v1.1.1 redirect * Fixup bad merge * Fixup bad merge * Fixup version history tables * Address @neilsjeffries comments * Change release date to today --- 1.1.0/implementation-notes/index.md | 683 ++++++++++++ 1.1.0/spec/change-log.md | 108 ++ 1.1.0/spec/index.md | 1375 +++++++++++++++++++++++++ 1.1.0/spec/validation-codes.md | 142 +++ 1.1.1/implementation-notes/index.html | 5 + 1.1.1/spec/index.html | 5 + 1.1.1/spec/validation-codes.html | 5 + 1.1/implementation-notes/index.md | 54 +- 1.1/spec/change-log.md | 48 +- 1.1/spec/index.md | 69 +- 1.1/spec/validation-codes.md | 1 - draft/implementation-notes/index.md | 2 +- draft/spec/index.md | 2 +- index.md | 5 +- news/index.md | 37 +- 15 files changed, 2483 insertions(+), 58 deletions(-) create mode 100644 1.1.0/implementation-notes/index.md create mode 100644 1.1.0/spec/change-log.md create mode 100644 1.1.0/spec/index.md create mode 100644 1.1.0/spec/validation-codes.md create mode 100644 1.1.1/implementation-notes/index.html create mode 100644 1.1.1/spec/index.html create mode 100644 1.1.1/spec/validation-codes.html diff --git a/1.1.0/implementation-notes/index.md b/1.1.0/implementation-notes/index.md new file mode 100644 index 0000000..b1ef63d --- /dev/null +++ b/1.1.0/implementation-notes/index.md @@ -0,0 +1,683 @@ +--- +no_site_title: true +--- +OCFL Hand-drive logo +# Implementation Notes, Oxford Common File Layout Specification +{:.no_toc} + +7 October 2022 + +**This Version:** +* + +**Latest Published Version:** +* + +**Editors:** + +* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), [Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) +* [Rosalyn 
Metz](https://orcid.org/0000-0003-3526-2230), [Emory University](https://web.library.emory.edu/) +* [Julian Morley](https://orcid.org/0000-0003-4176-1933), [Stanford University](https://library.stanford.edu/) +* [Simeon Warner](https://orcid.org/0000-0002-7970-7855), [Cornell University](https://www.library.cornell.edu/) +* [Andrew Woods](https://orcid.org/0000-0002-8318-4225), [Harvard University](https://library.harvard.edu/) + +**Additional Documents:** + +* [Specification](https://ocfl.io/1.1/spec/) +* [Validation Codes](https://ocfl.io/1.1/spec/validation-codes.html) +* [Extensions](https://github.com/OCFL/extensions/) + +**Previous version:** +* + +**Repository:** +* [Github](https://github.com/ocfl/spec) +* [Issues](https://github.com/ocfl/spec/issues) +* [Commits](https://github.com/ocfl/spec/commits) +* [Use Cases](https://github.com/ocfl/Use-Cases) + +This document is licensed under a [Creative Commons Attribution 4.0 +License](https://creativecommons.org/licenses/by/4.0/). [OCFL logo: +"hand-drive"](https://avatars0.githubusercontent.com/u/35607965) by +[Patrick Hochstenbach](http://orcid.org/0000-0001-8390-6171) is +licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/). + +## Introduction +{:.no_toc #abstract} + +_This section is non-normative._ + +This document provides guidance on implementation of the \[[OCFL-Specification](#ref-ocfl-specification)\] for how +clients should behave when operating on OCFL Objects. + +## Table of Contents +{:.no_toc #table-of-contents} + +* TOC placeholder (required by kramdown) +{:toc} + +## 1. Digital Preservation +{: #digital-preservation} + +### 1.1 Rebuildability +{: #rebuildability} + +A key goal of the OCFL is the rebuildability of a repository from an OCFL storage root without additional information +resources. Consequently, a key implementation consideration should be to ensure that OCFL objects contain all the data +and metadata required to achieve this. 
With reference to the \[[OAIS](#ref-oais)\] model, this would include all the +descriptive, administrative, structural, representation, and preservation metadata relevant to the object. + +Additionally, as an aid to those who may need to recover OCFL objects in the future, it is recommended that a copy of +the \[[OCFL-Specification](#ref-ocfl-specification)\] is stored in the top level of the OCFL storage root. The OCFL +ignores files other than the conformance declaration at the top level so it is a good location to store documentation +that may be useful for recovery. + +A more complete approach would be to create a specific OCFL object that contains this documentation and to have a +pointer to its location in the storage root. This documentation object would then be subject to OCFL validation and any +other digital preservation processes that might be implemented without requiring special handling. + +### 1.2 Fixity +{: #fixity} + +The digests in the manifest are used by the OCFL for content addressability rather than fixity but they are suitable for +use as part of a fixity regime, and the manifest block usefully identifies all the files in an object. OCFL validation +also requires that digests and files match. However, while the characteristics of digest algorithms that make them +suitable for fixity checking and content addressing are closely related, they are not identical. In particular, fixity +against malicious tampering requires that a digest computation is hard to reverse, which is not a requirement for +content addressing. It is this aspect which is the most frequent target for cryptoanalytic attack. + +Consequently, it is sensible to allow additional or alternative fixity algorithms to be used. These may be made in a +[fixity block](../spec/#fixity) which has the same layout as a manifest block but permits a broader range of algorithms. 
+The OCFL will consider a fixity block valid if all the files referenced in the block exist but the OCFL does not +validate digests for all possible algorithms. The fixity block does not have to include all the files in an object, which +permits legacy fixity to be imported without requiring continued use of obsolete digest algorithms. + +## 2. Storage +{: #storage} + +### 2.1 Object Contents +{: #object-contents} + +The OCFL separates the content path of stored files from the logical path of these files' content in OCFL object +versions. This is a key feature that allows previous versions of objects to remain immutable while permitting +deduplication, forward delta differencing, and easy file renaming. Consequently, the OCFL only requires that files added +to any version of an OCFL object must be stored somewhere within the relevant version directory, with a corresponding +entry in the manifest block. An entry in the state block determines the path and name of the file within that version by +referencing the manifest entry, not the actual path on disk. + +The most transparent approach is to have the path used to store the file on disk the same as the path of the file within +the object when accessioned. This is readily understandable in terms of visual inspection of the physical filesystem. + +However, this is not always possible. For example, complex objects with deep file hierarchies may encounter issues if +they come from a filesystem that allows longer paths than are supported by the target OCFL system. In this case, the +decoupling between content paths and logical paths in OCFL objects allows the use of truncated paths for storage while +the full paths can be preserved in state block entries which are not length constrained. + +Another use case is importing content from other repository systems which rename files on ingest and store them in a +flat hierarchy. 
These can be imported, as is, and the original paths and file names recorded through suitable state +block entries rather than reconstructing a physical file layout. Of course, the OCFL supports ongoing use of such a +methodology. + +#### 2.1.1 Data and Metadata +{: #data-and-metadata} + +OCFL object versions are composed of series of files/bitstreams but the OCFL does not make any distinction between +different types of files other than those reserved for OCFL functionality: the inventory, its digest file, and +conformance declaration files. It is possible, for example, to create separate data and metadata directories within each +version to help organize material but all files are treated equally for the purpose of OCFL validation and management. + +#### 2.1.2 Deduplication +{: #deduplication} + +The OCFL supports optional deduplication if a client ensures that all digests in the manifest block refer to a single +file path on disk. This entry is created the first time file content is stored in an OCFL Object. Subsequent references +to that file content should then occur in the state block only. This can be determined by computing the digests of +incoming files and determining if they already exist in the manifest block. + +If deduplication is carried out within an object then, for consistency, it is expected that Forward Delta differencing +will also be used between object versions so subsequent references to duplicated content should also refer back to the +original manifest entry rather than updating it to include additional references. + +#### 2.1.3 Filesystem metadata +{: #filesystem-metadata} + +Filesystem metadata (e.g. permissions, access, and creation times) are not considered portable between filesystems or +preservable through file transfer operations. Nor can these attributes be validated in terms of fixity in a consistent +manner. As such, the OCFL neither explicitly supports nor expects that these attributes remain consistent. 
If retaining +this metadata is important then files should either be encapsulated in a filesystem image format that preserves this +information, or the metadata extracted and stored explicitly in an additional file. + +#### 2.1.4 Empty Directories +{: #empty-directories} + +The OCFL preserves files and their content, with directories serving as a useful organizational convention. An empty +directory consists only of filesystem metadata and therefore, as noted above, is not amenable to direct preservation in +OCFL objects. If the preservation of empty directories is considered essential then the suggested route is to insert a +zero length file named `.keep` into the directory which will ensure directories are preserved as part of the file's +path. + +Note that `.keep` files are not considered special by the OCFL in any way and are treated exactly the same way as other +files. As such, a non-zero length `.keep` file is not considered invalid. + +#### 2.1.5 Objects with Many Small Files +{: #objects-with-many-small-files} + +Objects that contain a large number of files can pose performance problems if they are stored in a filesystem as-is. +Fixity checks, object validation and version creation can require an OCFL client to process all the files in an object +which can be time consuming. Additionally, most storage systems have a minimum block size for allocation to files, so a +large number of small files can end up occupying a volume of storage significantly larger than the sum of the individual +file sizes. In this case, assuming that the majority of the files are relatively static data that is unlikely to change +between objects versions, it is sensible to package the static files together in a single, larger file (zip is +recommended). This can be parsed to extract individual files if necessary but can significantly improve the efficiency +of basic OCFL client and storage operations. 
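
The packaging approach described in section 2.1.5 can be sketched as follows. This is only an illustration under stated assumptions: the function names `package_static_files` and `read_packaged_file` are hypothetical and not part of the OCFL or of any OCFL client, and the sketch assumes the archive is written outside the directory being packaged. It uses Python's standard `zipfile` module.

```python
import zipfile
from pathlib import Path


def package_static_files(source_dir: Path, archive_path: Path) -> int:
    """Pack every file under source_dir into one zip and return the file count.

    Assumption: archive_path is outside source_dir, so the archive being
    written is never picked up as one of its own members.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Sorted traversal gives a stable member order between runs.
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir, using forward slashes.
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
                count += 1
    return count


def read_packaged_file(archive_path: Path, member: str) -> bytes:
    """Extract the bytes of a single member without unpacking the whole archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.read(member)
```

The resulting zip would then be added to the object as a single file, so fixity checks and validation touch one manifest entry instead of one per small file. Individual files remain reachable through `read_packaged_file`.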
+ +### 2.2 Storage Root Hierarchy +{: #storage-root-hierarchy} + +Strictly speaking, the OCFL only requires that an OCFL Storage Root contains OCFL Objects in directories, distributed in +some manner in the underlying filesystem. In turn, an OCFL object is identified purely by the presence of a +\[[NAMASTE](#ref-namaste)\] conformance file in the object root. The presence and correctness of inventory files and +version directories are a validation rather than an identification concern. + +These definitions allow a lot of freedom as to how objects are arranged beneath an OCFL Storage Root and, while there is +no strict requirement for all OCFL Objects to be arranged according to the same system, it is nevertheless considered good +practice to do so. In addition, in the interests of rebuildability, it would be prudent to include an indication of the +details of this arrangement alongside the OCFL specification as described in the Rebuildability section. + +In the interests of transparency it makes sense for an object's URI, its unique identifier and its location under the +OCFL Storage Root to be aligned and simply derivable from each other. Good examples include: + +* Flat: Each object is contained in a directory with a name that is simply derived from the unique identifier of the +object, possibly with the escaping or replacement of characters that are not permitted in file and directory names. +While this is a very simple approach, most filesystems begin to encounter performance issues when directories contain +more than a few thousand files so this arrangement is best suited to repositories with a small number of objects (or +many OCFL Storage Roots). +``` +[storage_root] + ├── 0=ocfl_1.1 + ├── ocfl_1.1.html (optional copy of the OCFL specification) + ├── d45be626e024 + | ├── 0=ocfl_object_1.1 + | ├── inventory.json + | ├── inventory.json.sha512 + | └── v1... 
+ ├── d45be626e036 + | ├── 0=ocfl_object_1.1 + | ├── inventory.json + | ├── inventory.json.sha512 + | └── v1... + ├── 3104edf0363a + | ├── 0=ocfl_object_1.1 + | ├── inventory.json + | ├── inventory.json.sha512 + | └── v1... + └── ... +``` + +* PairTree: \[[PairTree](#ref-pairtree)\] is designed to overcome the limitations on the number of files in a directory +that most file systems have. It creates a hierarchy of directories by mapping identifier strings to directory paths two +characters at a time. For numerical identifiers specified in hexadecimal this means that there are a maximum of 256 +items in any directory which is well within the capacity of any modern filesystem. However, for long identifiers, +pairtree creates a large number of directories which will be sparsely populated unless the number of objects is very +large. Traversing all these directories during validation or rebuilding operations can be slow. +``` +[storage_root] + ├── 0=ocfl_1.1 + ├── ocfl_1.1.html (optional copy of the OCFL specification) + ├── d4 + | └── 5b + | └── e6 + | └── 26 + | └── e0 + | ├── 24 + | | └──d45be626e024 + | | ├── 0=ocfl_object_1.1 + | | └── ... + | └── 36 + | └──d45be626e036 + | ├── 0=ocfl_object_1.1 + | └── ... + ├── 31 + | └── 04 + | └── ed + | └── f0 + | └── 36 + | └── 3a + | └── 3104edf0363a + | ├── 0=ocfl_object_1.1 + | └── ... + └── ... +``` + +* Truncated n-tuple Tree: This approach aims to achieve some of the scalability benefits of PairTree whilst limiting the +depth of the resulting directory hierarchy. To achieve this, the source identifier can be split at a higher level of +granularity, and only a limited number of the identifier digits are used to generate directory paths. For example, using +triples and three levels with the example above yields: +``` +[storage_root] + ├── 0=ocfl_1.1 + ├── ocfl_1.1.html (optional copy of the OCFL specification) + ├── d45 + | └── be6 + | └── 26e + | ├──d45be626e024 + | | ├── 0=ocfl_object_1.1 + | | └── ... 
+ | └──d45be626e036 + | ├── 0=ocfl_object_1.1 + | └── ... + ├── 310 + | └── 4ed + | └── f03 + | └── 3104edf0363a + | ├── 0=ocfl_object_1.1 + | └── ... + └── ... +``` + +Some identifier schemes may require transformation before such approaches can be used effectively. A simple example +would be sequentially assigned identifiers, which would not distribute objects within the filesystem evenly. Hash +functions may be used to provide a unidirectional mapping between URI/PID and filesystem path, as required by the OCFL. +Encryption algorithms may be used to provide a bi-directional mapping which may be a useful aid to human readability. +Relevant details should be referenced in `ocfl_layout.json` in the Storage Root. + +### 2.3 Filesystem Features +{: #filesystem-features} + +In order to be portable across as many filesystems as possible, the OCFL makes use of a subset of filesystem features +that are very broadly supported. It is therefore strongly advised to not use additional features in OCFL Storage Roots +since OCFL clients and other filesystem tools that need to operate between different filesystems may exhibit +unpredictable behaviour when feature sets do not match. In particular, using features such as hard and soft (symbolic) +links for deduplication can work at odds with the OCFL's own mechanisms and should be avoided. + +Consideration should also be given to calculations of storage usage when migrating between filesystems. Many back-end +filesystem features, which are essentially invisible to user-space code, can have a significant impact on the actual +consumption of storage space compared with a simple sum of file sizes. Compression, extents and block sub-allocation +are examples of such features which, while providing benefits in terms of storage efficiency, do require care when +considering issues of capacity planning or migration. + +## 3. 
Client Behaviors +{: #client-behaviors} + +### 3.1 Basic File Operations +{: #basic-file-operations} + +The OCFL and its inventory structure are designed to support and capture the following file operations that create OCFL +versions, regardless of whether optional features, such as deduplication, are used. The OCFL is not concerned with the +process of creating versions but only the final outcome in terms of the differences with the previous version that need +to be recorded and preserved. + +* Inheritance: By default a new version of an OCFL Object inherits all the filenames and file content from the previous +version. This serves as the basis against which changes are applied to create a new version. A newly created OCFL +Object, obviously, inherits no content and is populated by file additions. + +* Addition: Adds a new file path and corresponding content to an OCFL Object. The path cannot exist in the previous +version of the object, and the content cannot have existed in any earlier versions of the object. + +* Updating: Changes the content pointed to by a file path. The path must exist in the previous version of the OCFL +Object, and the content cannot have existed in any earlier versions of the object. + +* Renaming: Changes the file path of existing content. The new path cannot exist in the previous version of the OCFL Object, +and the content must exist in the previous version of the object. + +* Deletion: Removes a file path and corresponding content from the current version of an OCFL Object. The path and +content remain available in earlier versions of the object. + +* Reinstatement: Makes content from a version earlier than the previous version available in the current version of an +OCFL Object. The content must exist in an earlier version, and not the previous version. The file path may exist in the +previous version, effectively updating the file path with older content, or it may not, effectively adding the older +content as a new file. 
+ +* Purging: Purging, as distinct from deletion, covers the complete removal of a file path and corresponding content from +all versions of an OCFL Object. This is a special case that is not supported as part of regular OCFL versioning +operations. An approach to implementing this is covered in a later section. + +### 3.2 Versioning +{: #versioning} + +#### 3.2.1 Version Numbering +{: #version-numbering} + +Version numbering should start with 1 and be positive sequential integers. Names start with a lower case `v`. The +numbers may be zero padded to the left to give fixed length, but, if used, zero padded numbers must always retain at +least one leftmost zero. All versions in an object must use the same version numbering layout which can be easily +determined by looking at one existing version — if the digit following `v` is a zero then the number format is zero +padded to fixed length, otherwise it is simply an integer. + +Systems with version directories have often used zero padding in order to show version order with common lexical sorting +tools (such as Unix `ls`). Zero padding is not recommended in order to avoid having to make arbitrary choices of padding +length or to place limits on the number of versions supported. + +#### 3.2.2 Version Immutability +{: #version-immutability} + +Previous versions of an object should be considered immutable since the composition of later versions of an object may +be dependent on them. In addition, the assumption of immutability ensures that copies of different versions of an object +remain consistent with each other, avoiding issues with identifying canonicity and reconciliation. + +One key consequence of this immutability is that manifest entries should never be deleted. New entries may be created, +and, if not deduplicating file content, additional references to copies of stored content may be added. 
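
The two versioning rules in this section (how version names advance, and that stored content is never dropped from the manifest) can be sketched as below. This is a minimal sketch rather than a reference implementation: the function names `next_version_name` and `removed_content_paths` are hypothetical, and the immutability check compares content paths rather than digests, on the assumption that digests may legitimately change if the digest algorithm is migrated.

```python
import re


def next_version_name(head: str) -> str:
    """Return the version name that follows head, keeping its numbering layout."""
    match = re.fullmatch(r"v(\d+)", head)
    if match is None:
        raise ValueError(f"not a version name: {head!r}")
    digits = match.group(1)
    number = int(digits)
    if number < 1:
        raise ValueError("version numbers start at 1")
    following = number + 1
    if digits.startswith("0"):
        # Zero padded layout: the width is fixed and a leading zero must remain.
        padded = str(following).zfill(len(digits))
        if len(padded) > len(digits) or not padded.startswith("0"):
            raise ValueError(f"zero padding exhausted after {head}")
        return "v" + padded
    return "v" + str(following)


def removed_content_paths(old_manifest: dict, new_manifest: dict) -> set:
    """Return content paths present in old_manifest but missing from new_manifest.

    An empty result means no stored content was dropped between inventories.
    """
    old_paths = {path for paths in old_manifest.values() for path in paths}
    new_paths = {path for paths in new_manifest.values() for path in paths}
    return old_paths - new_paths
```

A client could run `removed_content_paths` against the previous and proposed inventories before finalizing a version and refuse to proceed if the result is not empty. The padding check also shows why unpadded numbering is simpler: a padded layout eventually runs out of names.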
+ +### 3.3 File Purging +{: #file-purging} + +Sometimes a file needs to be deleted from all versions of an object, perhaps for legal reasons. Doing this to an OCFL +Object breaks the previous version immutability assumption and is not supported directly. The correct way to do this is +to create a new object that excludes the offending file, with a revised version history taking this into account. The +original object can then be deleted in its entirety. Creating the new object first is good practice as it avoids any +risk of data loss that may occur if an object were to be deleted before the new object is created. + +The new object need not have the same identifier as the original object. In this case, the deleted object may be +replaced by a placeholder object using the original identifier and location in the OCFL Storage Root. This is a standard +OCFL object with content that redirects users and software to the new version - possibly with an indication of why the +new object was created, if appropriate. The OCFL does not define redirect mechanisms, the interpretation of object +contents is purely a client application concern. + +### 3.4 Migrating to a New Digest Algorithm +{: #digest-migration} + +Over time new digest algorithms are developed to increase security and address vulnerabilities in existing algorithms. +It may become desirable to migrate an object to use a new algorithm while retaining [Version +Immutability](#version-immutability). OCFL supports this through the creation of a new version with a new +`digestAlgorithm` that either retains the same object content or is combined with a content update. + +Consider an example OCFL object where the `digestAlgorithm` in the `inventory.json` is `sha256` and the OCFL object +contains a single file (`file.txt`). We will illustrate migration to use `sha512` without changing the object content. 
+The starting `v1` file layout is: + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha256 + └── v1 + ├── inventory.json + ├── inventory.json.sha256 + └── content + └── file.txt +``` + +and the corresponding inventory is: + +``` +{ + "digestAlgorithm": "sha256", + "head": "v1", + "id": "http://example.org/digest_update_example", + "manifest": { + "579391...bfe": [ + "v1/content/file.txt" + ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2020-01-01T01:01:01", + "message": "sha256 forever", + "state": { + "579391...bfe": [ + "file.txt" + ] + }, + "user": { + "address": "mailto:secret@example.org", + "name": "Secret Agent" + } + } + } +} +``` + +We create a new version, `v2`, using the `digestAlgorithm` `sha512`. The `v1` directory and inventory are unchanged. The +`v2` directory has no `content` directory because no new content is added. The new inventory uses `sha512` values for +the `manifest` and `state` blocks; the legacy `sha256` digests are retained in the `fixity` block as an implementation +choice. 
The file layout of the object with `v2` is: + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + ├── v1 + │ ├── content + │ │ └── file.txt + │ ├── inventory.json + │ └── inventory.json.sha256 + └── v2 + ├── inventory.json + └── inventory.json.sha512 +``` + +and the corresponding `v2` inventory is: + +``` +{ + "digestAlgorithm": "sha512", + "fixity": { + "sha256": { + "579391...bfe": [ + "v1/content/file.txt" + ] + } + }, + "head": "v2", + "id": "http://example.org/digest_update_example", + "manifest": { + "7545b8720a60123...f67": [ + "v1/content/file.txt" + ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2020-01-01T01:01:01", + "message": "sha256 forever", + "state": { + "7545b8720a60123...f67": [ + "file.txt" + ] + }, + "user": { + "address": "mailto:secret@example.org", + "name": "Secret Agent" + } + }, + "v2": { + "created": "2020-03-26T21:00:00", + "message": "Update sha256 to sha512, no content change", + "state": { + "7545b8720a60123...f67": [ + "file.txt" + ] + }, + "user": { + "address": "mailto:special@example.org", + "name": "Special Agent" + } + } + } +} +``` + +### 3.5 Log Information +{: #log-information} + +There may be the need to record some actions on objects that do not result in changes to the object content. For +example, copying the object to new storage or validating fixity and finding nothing amiss. The `log` directory is the +location in an OCFL object where such events can be recorded. The OCFL does not make any assumptions about the contents +of this directory but, if it exists, then its contents will not be subject to any validation processes. + +### 3.6 Forward Delta +{: #forward-delta} + +Forward delta differencing is a key, though optional, feature of the OCFL that means that parts of an OCFL object +version that are unchanged from a previous version are not stored again. 
This has the potential to significantly improve +storage efficiency when objects have multiple versions, whether through ongoing curatorial action or the accessioning of +updated material. + +When a new version of an OCFL Object is created from an earlier version and a client wishes to implement forward delta +differencing, then the possible file operations are handled in the following manner (with reference to the state and +manifest blocks of the OCFL object's inventory file): + +* Inheritance: Files inherited from the previous version unchanged are referenced in the `state` block of the new +version. These entries will be identical to the corresponding entries in the previous version's `state` block. No +changes to the `manifest` block are required. When a new OCFL version of an OCFL Object is created, the starting point +against which changes are made should be to copy the entire `state` block of the previous version, thus inheriting all +the files and content from the previous version. + +* Addition: Newly added files appear as new entries in the `state` block of the new version. The file should be stored +and an entry for the new content must be made in the `manifest` block of the object's inventory. The new digest from the +`manifest` block can then be used to create the new `state` block entry. If the file content, as determined by its +digest, corresponds to an existing `manifest` entry then, technically, this is a reinstatement operation rather than +addition and should be flagged to prevent the operation being recorded incorrectly in preservation logs. + +* Updating: Files updated from the previous version appear as changed entries in the `state` block of the new version - +with new digests associated with content paths. The updated file should be stored and a new entry for the updated +content must be made in the `manifest` block of the object's inventory. 
The new digest can then be used to replace the +digest for the old content in the relevant `state` block entry. If the file content, as determined by its digest, +corresponds to an existing `manifest` entry then, technically, this is a reinstatement operation rather than updating +and should be flagged to prevent the operation being recorded incorrectly in preservation logs. + +* Renaming: Files renamed from the previous version appear as changed entries in the `state` block of the new version - +with existing digests associated with new file paths. No changes to the `manifest` block are required. + +* Deletion: Files deleted from the previous version are simply removed from the `state` block of the new version. + +* Reinstatement: Since reinstated content already exists in earlier versions of the OCFL Object, no changes to the +`manifest` block are required. Reinstated entries in the `state` block should replace any entries with the same path +inherited from the previous version. If the file paths are unchanged then these entries will be identical to the +corresponding entries in the earlier version's `state` block. + +### 3.7 An Example Approach to Updating OCFL Object Versions +{: #an-example-approach-to-updating-ocfl-object-versions} + +The OCFL is designed to be a specification that covers objects at rest and consequently does not specify in detail +update and file locking mechanisms since these are implementation dependent features. Nevertheless, this section +includes a simple example of a way to update OCFL Objects in a manner that tries to ensure that updates are as +transactional as possible, and that failures are detectable and recoverable. Objects that are being updated are, of +course, not expected to be valid OCFL objects until the update operation is completed. Creating a new OCFL Object only +differs from updating in that it involves creating a version directory before version update logic takes over. 
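
As one illustration of the inventory side of such an update, the forward delta handling from section 3.6 can be sketched as follows. The function `forward_delta_version` and its parameters are hypothetical names introduced for this sketch, not part of the OCFL. It assumes `sha512` as the `digestAlgorithm` and assumes new content is stored at a content path that mirrors its logical path.

```python
import hashlib


def forward_delta_version(manifest, prev_state, version,
                          new_files=None, renamed=None, deleted=None):
    """Build the manifest and state blocks for a new version using forward delta.

    manifest / prev_state: digest -> list of paths, as in an OCFL inventory.
    new_files: logical path -> bytes, for added or updated files.
    renamed: old logical path -> new logical path.
    deleted: iterable of logical paths to drop from the new state.
    Returns (new_manifest, new_state, to_store), where to_store maps
    content path -> bytes for content that is not already in the object.
    """
    # Inheritance: start from the previous state, viewed as path -> digest.
    path_to_digest = {p: d for d, paths in prev_state.items() for p in paths}

    # Deletion and renaming only touch the state, never the manifest.
    for path in (deleted or ()):
        del path_to_digest[path]
    for old, new in (renamed or {}).items():
        path_to_digest[new] = path_to_digest.pop(old)

    new_manifest = {d: list(paths) for d, paths in manifest.items()}
    to_store = {}
    for path, content in (new_files or {}).items():
        digest = hashlib.sha512(content).hexdigest()
        if digest not in new_manifest:
            # Genuinely new content: store it once and record it in the manifest.
            content_path = f"{version}/content/{path}"
            new_manifest[digest] = [content_path]
            to_store[content_path] = content
        # Content already known: reference the existing manifest entry only.
        path_to_digest[path] = digest

    new_state = {}
    for path, digest in sorted(path_to_digest.items()):
        new_state.setdefault(digest, []).append(path)
    return new_manifest, new_state, to_store
```

Existing manifest entries are copied and never modified, so earlier versions stay immutable. Only the entries of `to_store` need to be written into the new version directory.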
+ +#### 3.7.1 Segregating Objects-in-flight +{: #segregating-objects-in-flight} + +While an OCFL Object is being created or updated, the files and folders being written should be assembled in a location +that is ignored by OCFL parsers and validators. When an OCFL Version is transferred to the OCFL Object Root, the top +level inventory should be updated as the final operation. This ensures that all the files referenced by the inventory +are valid and, consequently, read-only clients that reference the inventory should continue to operate normally except +for the brief moment when the inventory is actually being updated. In practice, it is expected that upstream caching +should be able to cover this momentary unavailability in the majority of cases. This example approach defines the +following: + +* Workspace: Any OCFL Client should have a place to assemble versions before they are transferred to the Storage Root to +make the transfer operation easier to implement in a controllable and recoverable manner. Ideally it should be within +the same filesystem/namespace to make file transfers as atomic as possible, comprising filesystem metadata changes +rather than file copy operations. The Workspace is a directory that contains all such objects-in-flight in a similar +layout to an OCFL Storage Root. + +* Object Workspace Directory: In this example, for simplicity, each OCFL Object being created or updated has a path in +the Workspace that is the same as its eventual storage path in the OCFL Storage Root. This means that it is easy for +clients to determine if an object is being updated and also provides a mechanism to implement objects with mutable +current versions without violating OCFL assumptions. When updating an OCFL Object, it is useful to make a copy of the +inventory of the most recent stable version of the object in this directory when it is created at the start of the update +operation. 
+ +* Version Assembly Directory: In this case, one version assembly directory should exist per OCFL Object at any time and +updates to it should be managed by a single controlling process to avoid conflicts. If updates to multiple objects, or +from multiple sources, need to be supported, then it should be implemented upstream of version assembly in order to +allow additional locking and conflict resolution. It is useful to have a unique transaction identifier for each object +being updated to simplify any overlying process control logic (For example, it could be used as an eTag for a REST API +implementation). This transaction identifier can be used to generate a name for the version assembly directory, which +sits within the object workspace directory. + +* Temporary Inventory Files: A top level object inventory file should never be updated in situ. Instead, a new inventory +should be created in a version assembly directory and updated as the version contents are modified. This provides some +level of forensic recovery information in the event of failure during version creation. When the construction of the new +version and its inventory is complete, it can then be copied to a temporary file name alongside the top level inventory. +In this case, temporary inventory files always occur in OCFL Object Roots and are named `tmp_inventory.json`, for +argument's sake. The presence of this file is not valid OCFL and indicative of an update failure. + +#### 3.7.2 Operational Logic +{: #operational-logic} + +Bearing in mind the definitions specified above, it is possible to sketch out how they can be used to ensure some level +of integrity during OCFL updates. + +##### 3.7.2.1 Initiating a New Object Version +{: #initiating-a-new-object-version} + +1. Generate a new transaction ID + +2. Does a version assembly directory (in the requisite object workspace directory) for the object in question exist? If +yes, then abort with error - there is a transaction in process. 
Do not start a new transaction without rereading the +updated object. + +3. Create a version assembly directory and save the transaction ID - in this case, using it as the version assembly +directory name. + +4. Copy inventory from object into version assembly directory as a starting point for the new version + +##### 3.7.2.2 Updating a New Object Version +{: #updating-a-new-object-version} + +1. Get current transaction ID + +2. Does a version assembly directory for the object and transaction ID in question exist? If no, then abort with error - +the transaction ID is invalid for some reason, need to debug! + +3. Update object as described earlier. + +##### 3.7.2.3 Finalizing a New Object Version +{: #finalizing-a-new-object-version} + +1. Get current transaction ID + +2. Does a version assembly directory for the object and transaction ID in question exist? If no, then abort with error - +the transaction ID is invalid for some reason, need to debug! + +3. Generate inventory checksum file in version assembly directory. This effectively locks and finalizes the inventory. + +4. Move/rename version assembly directory to valid OCFL version directory name in OCFL Object Root in the OCFL Storage +Root. At this point all of the new version content is in place in the OCFL Object but the top level inventory still +references the previous version. This is assumed to be an atomic operation. Otherwise, copy the version assembly +directory, keeping its name then rename to a valid version directory. + +5. Copy inventory from new version to top level temporary inventory file. + +6. Update the inventory by deleting the old one and renaming the temporary one. This should not take very long and is +the only time when read-only clients cannot access the object because the inventory is not valid. + +7. Update the inventory checksum by deleting the old one and copying the one from the new version. + +8. Delete the transaction ID + +9. 
Clean up the Workspace to remove stale Object and version assembly directories + +##### 3.7.2.4 Clean up after failure +{: #clean-up-after-failure} + +1. Delete version assembly directories and temporary inventory files - this automatically reverts objects to last known +good version. Assembly directories anywhere in an OCFL Storage Root are a result of failed copies. + +2. If the inventory checksum fails then the inventory is corrupted by a failed copy. This should be recoverable from the +most recent version directory (which is the newly created one). + +3. Any overlying transactional store will also need cleanup. In essence, repeat steps 5-9 of the finalization procedure above after validating all object +checksums, since some sort of failure has just occurred. + +## 4. References +{: #references} + +### 4.1 Informative References +{: #informative-references} + +**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze. 9 November 2009. URL: + + +**\[OAIS]** Reference Model for an Open Archival Information System (OAIS), Issue 2. June 2012. +URL: + +**\[OCFL-Specification]** OCFL Specification v1.1. URL: + +**\[PairTree]** Pairtrees for Object Storage. J. Kunze; M. Haye; E. Hetzner; M. Reyes; C. +Snavely. 12 August 2008\.
URL: diff --git a/1.1.0/spec/change-log.md b/1.1.0/spec/change-log.md new file mode 100644 index 0000000..d4d159f --- /dev/null +++ b/1.1.0/spec/change-log.md @@ -0,0 +1,108 @@ +--- +no_site_title: true +--- +OCFL Hand-drive logo +# Oxford Common File Layout Specification v1.1 Change Log +{:.no_toc} + +7 October 2022 + +**Editors:** + +* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), [Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) +* [Rosalyn Metz](https://orcid.org/0000-0003-3526-2230), [Emory University](https://web.library.emory.edu/) +* [Julian Morley](https://orcid.org/0000-0003-4176-1933), [Stanford University](https://library.stanford.edu/) +* [Simeon Warner](https://orcid.org/0000-0002-7970-7855), [Cornell University](https://www.library.cornell.edu/) +* [Andrew Woods](https://orcid.org/0000-0002-8318-4225), [Harvard University](https://library.harvard.edu/) + +This document is licensed under a [Creative Commons Attribution 4.0 +License](https://creativecommons.org/licenses/by/4.0/). [OCFL logo: +"hand-drive"](https://avatars0.githubusercontent.com/u/35607965) by +[Patrick Hochstenbach](http://orcid.org/0000-0001-8390-6171) is +licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/). + +## Changes from OCFL v1.0 to v1.1 + +[Version 1.1 of the OCFL Specification](https://ocfl.io/1.1/spec/) is a [minor version](https://semver.org/) update to the [OCFL Specification v1.0](https://ocfl.io/1.0/spec/). The focus is correction and clarification, plus the addition of backwards compatible rules for the specification conformance of prior object versions. + +### Additions in v1.1 + +#### Add requirements to specification version number sequence + +Added [Conformance of prior versions](https://ocfl.io/1.1/spec/#conformance-of-prior-versions) section to clarify that existing version directories in an object are immutable and that the specification version number sequence must be monotonic. 
Adds error code [E103](https://ocfl.io/1.1/spec/#E103). (Issue [#544](https://github.com/OCFL/spec/issues/544)) + +### Clarifications in v1.1 + +#### One conformance declaration per object and storage root + +Update the [Object Conformance Declaration](https://ocfl.io/1.1/spec/#object-conformance-declaration) and [Root Conformance Declaration](https://ocfl.io/1.1/spec/#root-conformance-declaration) sections to clarify that there must be exactly one version declaration file. Error codes [E003](https://ocfl.io/1.1/spec/#E003) and [E076](https://ocfl.io/1.1/spec/#E076) were correspondingly updated. (Issue [#581](https://github.com/OCFL/spec/issues/581)) + +#### Inventory uses UTF-8 encoding + +In the [Inventory](https://ocfl.io/1.1/spec/#inventory) section, clarify that UTF-8 encoded JSON must be used for the `inventory.json` files. (Issue [#514](https://github.com/OCFL/spec/issues/514)) + +#### Version naming convention + +Update wording in [Version Directories](https://ocfl.io/1.1/spec/#version-directories) section to talk about version consistency for all versions of an object-at-rest, rather than in terms of the process for adding a version. (Issue [#541](https://github.com/OCFL/spec/issues/541)) + +#### Clarify manifest block requirements + +Add language in [Manifest](https://ocfl.io/1.1/spec/#manifest) section to clarify that the `manifest` block must be a JSON object (adding error code [E106](https://ocfl.io/1.1/spec/#E106)) and that each key must correspond to a digest value key found in one or more `state` blocks (adding error code [E107](https://ocfl.io/1.1/spec/#E107)). (Issue [#537](https://github.com/OCFL/spec/issues/537)) + +#### Clarify manifest requirements in historic inventories + +Wording of the [Content Directory](https://ocfl.io/1.1/spec/#content-directory) section improved to make it clear that for each historical inventory, the manifest must reference every file in that version directory.
(Issue [#538](https://github.com/OCFL/spec/issues/538)) + +#### Clarify language and error codes for version numbers + +Change [Version Directories](https://ocfl.io/1.1/spec/#version-directories) section to be more specific about version numbers. Adds error code [E104](https://ocfl.io/1.1/spec/#E104) for the specific case of missing prefix `v`, and [E105](https://ocfl.io/1.1/spec/#E105) for the specific case of using positive base-ten integers. (Issue [#532](https://github.com/OCFL/spec/issues/532)) + +#### Clarify that the content directory must be a direct child of the version directory + +Change [Content Directory](https://ocfl.io/1.1/spec/#content-directory) section to make it clear that the `contentDirectory` must indicate a direct child of the version directory. Adds error code [E108](https://ocfl.io/1.1/spec/#E108). (Issue [#530](https://github.com/OCFL/spec/issues/530)) + +#### Clarify that `id` must be the same across all versions + +Update [Basic Structure](https://ocfl.io/1.1/spec/#inventory-structure) section to make it clear that the `id` must not change between versions of the same object. Adds error code [E110](https://ocfl.io/1.1/spec/#E110). (Issue [#542](https://github.com/OCFL/spec/issues/542)) + +#### Use logical state consistently + +Use the notion of "logical state" consistently in the [Version](https://ocfl.io/1.1/spec/#version), [Version Inventory and Inventory Digest](https://ocfl.io/1.1/spec/#version-inventory) and [BagIt in an OCFL Object](https://ocfl.io/1.1/spec/#example-bagit-in-ocfl) sections. (Issue [#571](https://github.com/OCFL/spec/issues/571)) + +#### Clarify digest value case sensitivity requirements + +Change [Manifest](https://ocfl.io/1.1/spec/#manifest) and [Fixity](https://ocfl.io/1.1/spec/#fixity) sections to make it clear that the additional requirement for each digest value to appear only once in the manifest or fixity block applies only to case-insensitive digest algorithms. 
(Issue [#573](https://github.com/OCFL/spec/issues/573)) + +#### Clarify that fixity value must be a JSON object + +Change [Fixity](https://ocfl.io/1.1/spec/#fixity) section to specify that the value of the `fixity` key must be a JSON object. An empty object (`{}`) is allowed, but a JSON `null` value is not. Added error code [E111](https://ocfl.io/1.1/spec/#E111) and made [E055](https://ocfl.io/1.1/spec/#E055) more specific. (Issue [#558](https://github.com/OCFL/spec/issues/558)) + +#### Clarify use of registered and local extensions + +Change [Object Extensions](https://ocfl.io/1.1/spec/#object-extensions) and [Storage Root Extensions](https://ocfl.io/1.1/spec/#storage-root-extensions) to define registered extensions in terms of the [OCFL Extensions Repository](https://ocfl.github.io/extensions/). Added [Documenting Local Extensions](https://ocfl.io/1.1/spec/#documenting-local-extensions) section to describe local extensions. Adds error codes [E112](https://ocfl.io/1.1/spec/#E112) and [E113](https://ocfl.io/1.1/spec/#E113), updates error code [E067](https://ocfl.io/1.1/spec/#E067), and removes error codes `E068` and `E086` which were not being used within the community. Adds warning code [W016](https://ocfl.io/1.1/spec/#W016). (Issues [#557](https://github.com/OCFL/spec/issues/557), [#565](https://github.com/OCFL/spec/issues/565)) + +#### Improve guidance on inclusion of specification in storage root + +With the change from ReSpec to Markdown as the source format for the OCFL Specification it is now easy to store a complete copy of the specification in a storage root. This version suggests using the filename `ocfl_1.1.md` for a copy of the human-readable Markdown specification in the [Root Structure](https://ocfl.io/1.1/spec/#root-structure) section.
(Issues [#505](https://github.com/OCFL/spec/issues/505), [#554](https://github.com/OCFL/spec/issues/554)) + +#### Fix examples to match the specification + +Correct several examples that in the 1.0 specification did not fully comply with the specification. (Issue [#539](https://github.com/OCFL/spec/issues/539)) + +#### Reference RFC version of BagIt specification + +Update the reference to the BagIt specification from the draft to [RFC8493](https://datatracker.ietf.org/doc/html/rfc8493). (Issue [#571](https://github.com/OCFL/spec/issues/571)) + +### Corrections to validation codes + +#### Per-version validation codes + +Even for minor releases the validation codes may be updated. We have thus moved the `validation-codes.md` file into each version directory so that it will be versioned along with the specification. The version of this file for the v1.1 specification is rendered as . (Issue [#553](https://github.com/OCFL/spec/issues/553)) + +#### Fix E048 description + +The E048 error description in `validation-codes.md` is corrected to remove mention of `message` and `user` because they are optional. (Issue [#531](https://github.com/OCFL/spec/issues/531)) + +#### Fix E070 description + +The E070 error description in `validation-codes.md` is corrected to refer to `extension` rather than `key` (which was left from an earlier draft).
(Issue [#573](https://github.com/OCFL/spec/issues/573)) diff --git a/1.1.0/spec/index.md b/1.1.0/spec/index.md new file mode 100644 index 0000000..00cdcb3 --- /dev/null +++ b/1.1.0/spec/index.md @@ -0,0 +1,1375 @@ +--- +no_site_title: true +--- +OCFL Hand-drive logo +# Oxford Common File Layout Specification +{:.no_toc} + +7 October 2022 + +**This Version:** +* + +**Latest Published Version:** +* + +**Editors:** + +* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), [Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) +* [Rosalyn Metz](https://orcid.org/0000-0003-3526-2230), [Emory University](https://web.library.emory.edu/) +* [Julian Morley](https://orcid.org/0000-0003-4176-1933), [Stanford University](https://library.stanford.edu/) +* [Simeon Warner](https://orcid.org/0000-0002-7970-7855), [Cornell University](https://www.library.cornell.edu/) +* [Andrew Woods](https://orcid.org/0000-0002-8318-4225), [Harvard University](https://library.harvard.edu/) + +**Former Editors:** + +* [Andrew Hankinson](https://orcid.org/0000-0003-2663-0003) + +**Additional Documents:** + +* [Implementation Notes](https://ocfl.io/1.1/implementation-notes/) +* [Specification Change Log](https://ocfl.io/1.1/spec/change-log.html) +* [Validation Codes](https://ocfl.io/1.1/spec/validation-codes.html) +* [Extensions](https://github.com/OCFL/extensions/) + +**Previous Version:** +* + +**Repository:** +* [Github](https://github.com/ocfl/spec) +* [Issues](https://github.com/ocfl/spec/issues) +* [Commits](https://github.com/ocfl/spec/commits) +* [Use Cases](https://github.com/ocfl/Use-Cases) + +This document is licensed under a [Creative Commons Attribution 4.0 +License](https://creativecommons.org/licenses/by/4.0/). [OCFL logo: +"hand-drive"](https://avatars0.githubusercontent.com/u/35607965) by +[Patrick Hochstenbach](http://orcid.org/0000-0001-8390-6171) is +licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/). 
+ +## Introduction +{:.no_toc #abstract} + +_This section is non-normative._ + +This Oxford Common File Layout (OCFL) specification describes an application-independent approach to the storage of +digital objects in a structured, transparent, and predictable manner. It is designed to promote long-term access and +management of digital objects within digital repositories. + +### Need +{:.no_toc #need} + +The OCFL initiative began as a discussion amongst digital repository practitioners to identify well-defined, common, and +application-independent file management for a digital repository's persisted objects and represents a specification of +the community’s collective recommendations addressing five primary requirements: completeness, parsability, versioning, +robustness, and storage diversity. + +#### Completeness +{:.no_toc #completeness} + +The OCFL recommends storing metadata and the content it describes together so the OCFL object can be fully understood in +the absence of original software. The OCFL does not make recommendations about what constitutes an object, nor does it +assume what type of metadata is needed to fully understand the object, recognizing those decisions may differ from one +repository to another. However, it is recommended that when making this decision, implementers consider what is +necessary to rebuild the objects from the files stored. + +#### Parsability +{:.no_toc #parsability} + +One goal of the OCFL is to ensure objects remain fixed over time. This can be difficult as software and infrastructure +change, and content is migrated. To combat this challenge, the OCFL ensures that both humans and machines can understand +the layout and corresponding inventory regardless of the software or infrastructure used. This allows for humans to read +the layout and corresponding inventory, and understand it without the use of machines. 
Additionally, if existing +software were to become obsolete, the OCFL could easily be understood by a lightweight application, even without the +full-featured repository that might have been used in the past. + +#### Versioning +{:.no_toc #versioning} + +Another need expressed by the community was the ability to update and change objects, either the content itself or the +metadata associated with the object. The OCFL relies heavily on the prior art in the \[[Moab](#ref-moab)\] Design for +Digital Object Versioning which utilizes forward deltas to track the history of the object. Utilizing this schema allows +implementers of the OCFL to easily recreate past versions of an OCFL object. As with objects, the OCFL remains silent +on when versioning should occur, recognizing this may differ from implementation to implementation. + +#### Robustness +{:.no_toc #robustness} + +The OCFL also fills the need for robustness against errors, corruption, and migration. The versioning schema ensures an +OCFL object is robust enough to allow for the discovery of human errors. The fixity checking built into the OCFL via +content-addressable storage allows implementers to identify file corruption that might happen outside of normal human +interactions. The OCFL eases content migrations by providing a technology-agnostic method for verifying OCFL objects +have remained fixed. + +#### Storage diversity +{:.no_toc #storage-diversity} + +Finally, the community expressed a need to store content on a wide variety of storage technologies. With that in mind, +the OCFL was written with an eye toward various storage infrastructures including cloud object stores. + +### Note +{:.no_toc #note} + +This normative specification describes the nature of an OCFL Object (the "object-at-rest") and the arrangement of OCFL +Objects under an OCFL Storage Root.
A set of recommendations for how OCFL Objects should be acted upon (the +"object-in-motion") can be found in the \[[OCFL-Implementation-Notes](#ref-ocfl-implementation-notes)\]. The OCFL +editorial group recommends reading both the specification and the implementation notes in order to understand the full +scope of the OCFL. + +This specification is designed to operate on storage systems that employ a hierarchical metaphor for presenting data to +users. On traditional disk-based storage this may take the form of files and directories, and this is the terminology we +use in this specification since it is widely known. However, it may equally apply to object stores, where namespaces, +containers, and objects present a similar organization hierarchy to users. + +## Table of Contents +{:.no_toc #table-of-contents} + +* TOC placeholder (required by kramdown) +{:toc} + +## 1. Conformance +{: #conformance} + +As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this +specification are non-normative. Everything else in this specification is normative. + +The key words MAY, MUST, MUST +NOT, SHOULD, and SHOULD NOT are to be interpreted as +described in \[[RFC2119](#ref-rfc2119)\]. + +## 2. Terminology +{: #terminology} + +* **Content Path:** The file path of a file on disk or in an object store, relative to the +[OCFL Object Root](#dfn-ocfl-object-root). Content paths are used in the [Manifest](#dfn-manifest) within an +[Inventory](#dfn-inventory). + +* **Digest:** An algorithmic characterization of the contents of a file conforming to a standard +digest algorithm. + +* **Extension:** Extensions are used to collaborate, review, and publish additional +non-normative functions related to OCFL. Extensions are intended to be informational and cite-able, but outside the +scope of the normal specification process. 
Registered extensions may be found in the [OCFL Extensions +repository.](https://ocfl.github.io/extensions/) + +* **Inventory:** A file, expressed in JSON, that tracks the history and current state of an +OCFL Object. + +* **Logical Path:** A path that represents a file's location in the [logical +state](#dfn-logical-state) of an object. Logical paths are used in conjunction with a digest to represent the file name +and path for a given bitstream at a given version. + +* **Logical State:** A grouping of logical paths tied to their corresponding bitstreams +that reflect the state of the object content for a given version. + +* **Logs Directory:** A directory for storing information about the content (e.g., actions +performed) that is not part of the content itself. + +* **Manifest:** A section of the [Inventory](#dfn-inventory) listing all files and their digests +within an OCFL Object. + +* **OCFL Object:** A group of one or more content files and administrative information, that +together have a unique identifier. The object may contain a sequence of versions of the files that represent the +evolution of the object's contents. + +* **OCFL Object Root:** The base directory of an [OCFL Object](#dfn-ocfl-object), +identified by a \[[NAMASTE](#ref-namaste)\] file "0=ocfl_object_1.1". + +* **OCFL Storage Root:** A base directory used to store OCFL Objects, identified by a +\[[NAMASTE](#ref-namaste)\] file "0=ocfl_1.1". + +* **OCFL Version:** The state of an [OCFL Object](#dfn-ocfl-object)'s content which is +constructed using the incremental changes recorded in the sequence of corresponding and prior version directories. + +* **Registered Extension Name:** The registered name of an extension is the +name provided in the _Extension Name_ property of the extension's definition in the [OCFL Extensions +repository](https://ocfl.github.io/extensions/). + +## 3. 
OCFL Object +{: #object-spec} + +An OCFL Object is a group of one or more content files and administrative information, that are together identified by a +URI. The object may contain a sequence of versions of the files that represent the evolution of the object's contents. + +A file is defined as a content bitstream that can be stored and transmitted. Directories (also called "folders") allow +for the organization of files into tree-like hierarchies. The content of an OCFL Object is the files and the directories +they are organized in that are stored _within_ the hierarchy layout described in this specification. + +An OCFL Object includes administrative information that identifies a directory as an OCFL Object, and also provides a +means of tracking changes to the contents of the object over time. + +An OCFL Object is therefore: + +1. A conceptual gathering of all files (data and metadata), the directories they are organized in, and their changes +over time which together form the digital representation of an entity that need to be managed, in preservation terms, as +a single coherent whole (i.e., content); and + +2. A file and directory layout and administrative information on a storage medium that provides a defined structure for +the storage of this content, and through which these files and their changes may be understood (i.e., structure). + +A key goal of the OCFL is the rebuildability of a repository from an OCFL Storage Root without additional information +resources. Consequently, a key implementation consideration should be to ensure that OCFL Objects contain all the data +and metadata required to achieve this. With reference to the \[[OAIS](#ref-oais)\] model, this would include all the +descriptive, administrative, structural, representation and preservation metadata relevant to the object. + +A central feature of the OCFL specification is support for versioning. 
This recognizes that digital objects will change +over time, through new requirements, fixes, updates, or format shifts. The specification takes no position on what +constitutes a version or a versionable action, but it is recommended that implementers have a clear position on this +within their local storage policies. + +### 3.1 Object Structure +{: #object-structure} + +The OCFL Object structure organizes content files and administrative information in order to support content storage and +object validation. The structure for an object with one version is shown in the following figure: + +``` +[object_root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + └── v1 + ├── inventory.json + ├── inventory.json.sha512 + └── content + └── ... content files ... +``` + +The [OCFL Object Root](#dfn-ocfl-object-root) MUST NOT contain files or +directories other than those specified in the following sections. + +### 3.2 Object Conformance Declaration +{: #object-conformance-declaration} + +The OCFL specification version declaration MUST be formatted according to the +\[[NAMASTE](#ref-namaste)\] specification. There MUST be exactly one version +declaration file in the base directory of the [OCFL Object Root](#dfn-ocfl-object-root) giving the OCFL version in the +filename. The filename MUST conform to the pattern `T=dvalue`, where `T` MUST be 0, and `dvalue` MUST be `ocfl_object_`, +followed by the OCFL specification version number. The text contents of the file MUST be the same as `dvalue`, followed by a newline (`\n`). + +### 3.3 Version Directories +{: #version-directories} + +OCFL Object content MUST be stored as a sequence of one or more versions. Each +object version is stored in a version directory under the object root. Version directory names MUST be constructed by prepending `v` to the version number. The version number MUST be taken from the sequence of positive, base-ten integers: 1, 2, 3, etc.. 
The version number +sequence MUST start at 1 and MUST be +continuous without missing integers. + +Implementations SHOULD use version directory names constructed without +zero-padding the version number, i.e., `v1`, `v2`, `v3`, etc. + +For compatibility with existing filesystem conventions, implementations MAY use zero-padded +version directory numbers, with the following restriction: If zero-padded version directory numbers are used then they +MUST start with the prefix `v` and then a zero. For example, in an implementation +that uses five digits for version directory names, `v00001` to `v09999` are allowed but `v10000` is not. + +The first version of an object defines the naming convention for all version directories for the object. All version +directories of an object MUST use the same naming convention: either a non-padded +version directory number, or a zero-padded version directory number of consistent length. The version naming convention +MUST be consistent across all versions. In all cases, references to files inside +version directories from inventory files MUST use the actual version directory +names. + +There MUST be no other files as children of a version directory, other than an +[inventory file](#inventory) and an [inventory digest](#inventory-digest). The version directory SHOULD NOT contain any directories other than the designated content sub-directory. Once created, +the contents of a version directory are expected to be immutable. + +#### 3.3.1 Content Directory +{: #content-directory} + +Version directories MUST contain a designated content sub-directory if the +version contains files to be preserved, and SHOULD NOT contain this sub-directory +otherwise. The name of this designated sub-directory MAY be defined in the [inventory +file](#inventory) using the key `contentDirectory` with the value being the chosen sub-directory name as a string, +relative to the version directory.
The `contentDirectory` value MUST represent a +direct child directory of the version directory in which it is found. As such, the `contentDirectory` value MUST NOT contain the forward slash (`/`) path separator and MUST NOT be either one or two periods (`.` or `..`). If the key `contentDirectory` is set, it +MUST be set in the first version of the object and MUST NOT change between versions of the same object. + +If the key `contentDirectory` is not present in the [inventory file](#inventory) then the name of the designated content +sub-directory MUST be `content`. OCFL-compliant tools (including any validators) +MUST ignore all directories in the object version directory except for the +designated content directory. + +Every file within a version's content directory MUST be referenced in the +[manifest](#manifest) section of that version's inventory. There MUST NOT be +empty directories within a version's content directory. A directory that would otherwise be empty MAY be maintained by creating a file within it named according to local conventions, for example +by making an empty `.keep` file. + +### 3.4 Digests +{: #digests} + +A [digest](#dfn-digest) plays two roles in an OCFL Object. The first is that digests allow for content-addressable +reference to files within the OCFL Object. That is, the connection between a file's [content path](#dfn-content-path) on +physical storage and its [logical path](#dfn-logical-path) in a version of the object's content is made with a digest of +its contents, rather than its filename. This use of the content digest facilitates de-duplication of files with the same +content within an object, such as files that are unchanged from one version to the next. The second role that digests +play is to provide for fixity checks to determine whether a file has become corrupt, for example through hardware degradation or +accident. + +For content-addressing, OCFL Objects MUST use either `sha512` or `sha256`, and +SHOULD use `sha512`.
The choice of the `sha512` digest algorithm as default +recognizes that it has no known collision vulnerabilities and multiple implementations are available. + +For storage of additional fixity values, or to support legacy content migration, implementers MUST choose from the following controlled vocabulary of digest algorithms, or from a list of +additional algorithms given in the \[[Digest-Algorithms-Extension](#ref-digest-algorithms-extension)\]. OCFL clients +MUST support all fixity algorithms given in the table below, and MAY support additional algorithms from the extensions. Optional fixity algorithms that are not +supported by a client MUST be ignored by that client. + +| Digest Algorithm Name | Note | +| --- | --- | +| `md5` | Insecure. Use only for legacy fixity values. MD5 algorithm and hex encoding defined by \[[RFC1321](#ref-rfc1321)\]. For example, the `md5` digest of a zero-length bitstream is `d41d8cd98f00b204e9800998ecf8427e`. | +| `sha1` | Insecure. Use only for legacy fixity values. SHA-1 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha1` digest of a zero-length bitstream is `da39a3ee5e6b4b0d3255bfef95601890afd80709`. | +| `sha256` | Non-truncated form only; note performance implications. SHA-256 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha256` digest of a zero-length bitstream starts `e3b0c44298fc1c149afbf4c8996fb92427ae41e4...` (64 hex digits long). | +| `sha512` | Default choice. Non-truncated form only. SHA-512 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha512` digest of a zero-length bitstream starts `cf83e1357eefb8bdf1542850d66d8007d620e405...` (128 hex digits long). 
| +| `blake2b-512` | Full-length form only, using the 2B variant (64 bit) as defined by \[[RFC7693](#ref-rfc7693)\]. MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `blake2b-512` digest of a zero-length bitstream starts `786a02f742015903c6c6fd852552d272912f4740...` (128 hex digits long). | + +An OCFL Inventory MAY contain a fixity section that can store one or more blocks containing +fixity values using multiple digest algorithms. See the [section on fixity](#fixity) below for further details. + +> Non-normative note: Implementers may also store copies of their file digests in a system external to their OCFL Object +stores at the point of ingest, to further safeguard against the possibility of malicious manipulation of file contents +and digests. +> +> Implementers should be aware that base16 digests are case insensitive. Different tools will generate digests in +uppercase or lowercase, and this may lead to case differences between references to a digest and the digest itself +within the inventory. If string-based methods are used to work with digests and inventories (as is the case in most +common JSON libraries) then extra care must be taken to ensure case-insensitive comparisons are being made. + +### 3.5 Inventory +{: #inventory} + +An OCFL Object Inventory MUST follow the JSON (defined by +\[[RFC8259](#ref-rfc8259)\]) structure described in this section with contents encoded in UTF-8, and MUST be named `inventory.json`. The order of entries in both the JSON objects and arrays used in +inventory files has no significance. An OCFL Object Inventory MUST NOT contain +any keys not described in this specification. + +The forward slash (/) path separator MUST be used in content paths in the +[manifest](#manifest) and [fixity](#fixity) blocks within the inventory. Implementations that target systems using other +separators will need to translate paths appropriately. 
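
> Non-normative example: The path translation mentioned above could be sketched in Python as follows. The function name `to_content_path` and its `windows` flag are hypothetical and not defined by this specification; the sketch assumes the input is a relative path written with the native separator of the target system.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_content_path(native_path, windows=False):
    """Return a forward-slash path for use in a manifest or fixity block.

    Hypothetical helper: `windows=True` treats the backslash as a separator.
    """
    path_cls = PureWindowsPath if windows else PurePosixPath
    path = path_cls(native_path)
    # Content paths are relative to the OCFL Object Root, so reject any
    # drive letter or leading separator.
    if path.anchor:
        raise ValueError("content paths must be relative: " + native_path)
    # as_posix() joins the path elements with forward slashes.
    return path.as_posix()
```

Using the pure path classes keeps the sketch independent of the operating system it runs on, which may be convenient when an inventory is written on one platform and read on another.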
+ +> Non-normative note: A \[[JSON-Schema](#ref-json-schema)\] for validating OCFL Object Inventory files is provided at +[inventory_schema.json](inventory_schema.json). + +#### 3.5.1 Basic Structure +{: #inventory-structure} + +Every OCFL inventory MUST include the following keys: + +* `id`: A unique identifier for the OCFL Object. This MUST be unique in the local +context, MUST NOT change between versions of the same object, and SHOULD be a URI \[[RFC3986](#ref-rfc3986)\]. There is no expectation that a URI used is +resolvable. For example, URNs \[[RFC8141](#ref-rfc8141)\] MAY be used. + +* `type`: A type for the inventory JSON object that also serves to document the OCFL specification version that the +inventory complies with. In the object root inventory this MUST be the URI of the +inventory section of the specification version matching the object conformance declaration. For the current +specification version the value is `https://ocfl.io/1.1/spec/#inventory`. + +* `digestAlgorithm`: The digest algorithm used for calculating digests for content-addressing within the OCFL Object and +for the [Inventory Digest](#inventory-digest). This MUST be the algorithm used in +the `manifest` and `state` blocks, see the [section on Digests](#digests) for more information about algorithms. + +* `head`: The version directory name of the most recent version of the object. This MUST be the version directory name with the highest version number. + +There MAY be the following key: + +* `contentDirectory`: The name of the designated content directory within the version directories. If not specified then +the content directory name is `content`. + +In addition to these keys, there MUST be two other blocks present, `manifest` and +`versions`, which are discussed in the next two sections. 
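As a rough illustration of the key requirements above, the following sketch checks a parsed inventory for the required top-level keys and applies the `content` default. The helper names `missing_inventory_keys` and `content_directory` are assumptions for this example only; a real validator would check far more than key presence.

```python
# Keys that every inventory must include, per the Basic Structure section.
REQUIRED_KEYS = ("id", "type", "digestAlgorithm", "head", "manifest", "versions")


def missing_inventory_keys(inventory: dict) -> list:
    """Return the required top-level keys absent from a parsed inventory."""
    return [key for key in REQUIRED_KEYS if key not in inventory]


def content_directory(inventory: dict) -> str:
    """Return the content directory name, defaulting to `content`."""
    return inventory.get("contentDirectory", "content")


# Hypothetical, deliberately incomplete inventory used only for illustration.
example = {
    "id": "http://example.org/minimal",
    "type": "https://ocfl.io/1.1/spec/#inventory",
    "digestAlgorithm": "sha512",
    "head": "v1",
}
print(missing_inventory_keys(example))
print(content_directory(example))
```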
+ +#### 3.5.2 Manifest +{: #manifest} + +The value of the `manifest` key MUST be a JSON object, and each key MUST correspond to a digest value key found in one or more `state` blocks of the +current and/or previous `version` blocks of the [OCFL Object](#dfn-ocfl-object). The value for each key MUST be an array containing the [content path](#dfn-content-path)s of files in the OCFL Object +that have content with the given digest. As JSON keys are case sensitive, for digest algorithms with case insensitive +digest values, there is an additional requirement that each digest value MUST +occur only once in the manifest block for any digest algorithm, regardless of case. Content paths within a manifest +block MUST be relative to the [OCFL Object Root](#dfn-ocfl-object-root). The +following restrictions avoid ambiguity and provide path safety for clients processing the `manifest`. + +* The content path MUST be interpreted as a set of one or more path elements +joined by a `/` path separator. + +* Path elements MUST NOT be `.`, `..`, or empty (`//`). + +* A content path MUST NOT begin or end with a forward slash (`/`). + +* Within an inventory, content paths MUST be unique and non-conflicting, so the +content path for a file cannot appear as the initial part of another content path. + +> Non-normative note: If only one file is stored in the OCFL Object for each digest, fully de-duplicating the content, +then there will be only one [content path](#dfn-content-path) for each digest. There may, however, be multiple logical +paths for a given digest if the content was not entirely de-duplicated when constructing the OCFL Object. 
+> +> An example manifest object for three content paths, all in version 1, is shown below: +> +> ```json +"manifest": { + "7dcc35...c31": [ "v1/content/foo/bar.xml" ], + "cf83e1...a3e": [ "v1/content/empty.txt" ], + "ffccf6...62e": [ "v1/content/image.tiff" ] +} +> ``` + +#### 3.5.3 Versions +{: #versions} + +An OCFL Object Inventory MUST include a block for storing versions. This block +MUST have the key of `versions` within the inventory, and it MUST be a JSON object. The keys of this object MUST +correspond to the names of the [version directories](#version-directories) used. Each value MUST be another JSON object that characterizes the version, as described in the [3.5.3.1 +Version](#version) section. + +##### 3.5.3.1 Version +{: #version} + +A JSON object to describe one [OCFL Version](#dfn-ocfl-version), which MUST +include the following keys: + +* `created`: The value of this key is the datetime of creation of this version. It MUST be expressed in the Internet Date/Time Format defined by \[[RFC3339](#ref-rfc3339)\]. This +format requires the inclusion of a timezone value or `Z` for UTC, and that the time component be granular to the second +level (with optional fractional seconds). + +* `state`: The value of this key is a JSON object, containing a list of keys and values corresponding to the [logical +state](#dfn-logical-state) of the object at that version. The keys of this JSON object are digest values, each of which +MUST exactly match a digest value key in the [manifest of the +inventory](#manifest). The value for each key is an array containing [logical path](#dfn-logical-path) names of files in +the OCFL Object's logical state that have content with the given digest. + +[Logical paths](#logical-path) present the structure of an OCFL Object at a given version. This is given as an array of +values, with the following restrictions to provide for path safety in the common case of the logical path value +representing a file path. 
+ +* The logical path MUST be interpreted as a set of one or more path elements +joined by a `/` path separator. + +* Path elements MUST NOT be `.`, `..`, or empty (`//`). + +* A logical path MUST NOT begin or end with a forward slash (`/`). + +* Within a version, logical paths MUST be unique and non-conflicting, so the +logical path for a file cannot appear as the initial part of another logical path. + +> Non-normative note: The [logical state](#dfn-logical-state) of the object uses content-addressing to map logical paths +to their bitstreams, as expressed in the manifest section of the inventory. Notably, the version state provides +de-duplication of content within the OCFL Object by mapping multiple logical paths with the same content to the same +digest in the manifest. See \[[OCFL-Implementation-Notes](#ref-ocfl-implementation-notes)\]. +> +> An example `state` block is shown below: +> +> ```json +"state": { + "4d27c8...b53": [ "foo/bar.xml" ], + "cf83e1...a3e": [ "empty.txt", "empty2.txt" ] +} +> ``` +> +> This `state` block describes an object with 3 files, two of which have the same content (`empty.txt` and +`empty2.txt`), and one of which is in a sub-directory (`bar.xml`). The [logical state](#dfn-logical-state) shown as a +tree is thus: +> +> ``` +├── empty.txt +├── empty2.txt +└── foo + └── bar.xml +> ``` + +The JSON object describing an [OCFL Version](#dfn-ocfl-version), SHOULD include +the following keys: + +* `message`: The value of this key is freeform text, used to record the rationale for creating this version. It MUST be a JSON string. + +* `user`: The value of this key is a JSON object intended to identify the user or agent that created the current [OCFL +Version](#dfn-ocfl-version). The value of the `user` key MUST contain a user name +key, `name` and SHOULD contain an address key, `address`. The `name` value is any +readable name of the user, e.g., a proper name, user ID, agent ID. 
The `address` value SHOULD be a URI: either a mailto URI \[[RFC6068](#ref-rfc6068)\] with the e-mail address of the +user or a URL to a personal identifier, e.g., an ORCID iD. + +#### 3.5.4 Fixity +{: #fixity} + +An OCFL Object inventory MAY include a block for storing additional fixity information to +supplement the complete set of digests in the [Manifest](#manifest), for example to support legacy digests from a +content migration. If present, this block MUST have the key of `fixity` within +the inventory, and its value MUST be a JSON object, which MAY be empty. + +The keys within the `fixity` block MUST correspond to the controlled vocabulary +of [digest algorithm names](#digest-algorithms) listed in the [Digests](#digests) section, or in a table given in an +[Extension](#dfn-extension). The value of the fixity block for a particular digest algorithm MUST follow the structure of the [3.5.2 Manifest](#manifest) block; that is, a key corresponding +to the digest value, and an array of [content path](#dfn-content-path)s. The `fixity` block for any digest algorithm +MAY include digest values for any subset of content paths in the object. Where included, +the digest values given MUST match the digests of the files at the corresponding +content paths. As JSON keys are case sensitive, for digest algorithms with case insensitive digest values, there is an +additional requirement that each digest value MUST occur only once in the +`fixity` block for any digest algorithm, regardless of case. There is no requirement that all content files have a value +in the `fixity` block, or that fixity values provided in one version are carried forward to later versions. + +> An example `fixity` with `md5` and `sha1` digests is shown below. In this case the `md5` digest values are provided +only for version 1 content paths. 
+> +> ```json +"fixity": { + "md5": { + "184f84e28cbe75e050e9c25ea7f2e939": [ "v1/content/foo/bar.xml" ], + "c289c8ccd4bab6e385f5afdd89b5bda2": [ "v1/content/image.tiff" ], + "d41d8cd98f00b204e9800998ecf8427e": [ "v1/content/empty.txt" ] + }, + "sha1": { + "66709b068a2faead97113559db78ccd44712cbf2": [ "v1/content/foo/bar.xml" ], + "a6357c99ecc5752931e133227581e914968f3b9c": [ "v2/content/foo/bar.xml" ], + "b9c7ccc6154974288132b63c15db8d2750716b49": [ "v1/content/image.tiff" ], + "da39a3ee5e6b4b0d3255bfef95601890afd80709": [ "v1/content/empty.txt" ] + } +} +> ``` + +### 3.6 Inventory Digest +{: #inventory-digest} + +Every occurrence of an inventory file MUST have an accompanying sidecar file +named `inventory.json.ALGORITHM` stating its digest, where `ALGORITHM` is the chosen digest algorithm for the object. +The ALGORITHM MUST match the value given for the `digestAlgorithm` key in the +inventory. An example might be `inventory.json.sha512`. + +The digest sidecar file MUST contain the digest of the inventory file. This MUST follow the format: + +``` +DIGEST inventory.json +``` + +One or more whitespace characters (spaces or tabs) must separate DIGEST from the string `inventory.json`; that is, the +name of the inventory file in the same directory. + +The digest of the inventory MUST be computed only after all changes to the +inventory have been made, and thus writing the digest sidecar file is the last step in the versioning process. + +### 3.7 Version Inventory and Inventory Digest +{: #version-inventory} + +Every OCFL Object MUST have an inventory file within the OCFL Object Root, +corresponding to the state of the OCFL Object at the current version. Additionally, every version directory SHOULD include an inventory file that is an [Inventory](#inventory) of all content for +versions up to and including that particular version. 
Where an OCFL Object contains `inventory.json` in version +directories, the inventory file in the OCFL Object Root MUST be the same as the +file in the most recent version. See also requirements for the corresponding [Inventory Digest](#inventory-digest). + +In the case that prior version directories include an inventory file there will be multiple inventory files describing +prior versions within the OCFL Object. Each `version` block in each prior inventory file MUST represent the same [logical state](#dfn-logical-state) as the corresponding `version` block +in the current inventory file. Additionally, the values of the `created`, `message` and `user` keys in each `version` +block in each prior inventory file SHOULD have the same values as the +corresponding keys in the corresponding `version` block in the current inventory file. + +> Non-normative note: Storing an inventory for every version provides redundancy for this critical information in a way +that is compatible with storage strategies that have immutable version directories. + +#### 3.7.1 Conformance of prior versions +{: #conformance-of-prior-versions} + +Version directories in OCFL are intended to be immutable in that existing version directories do not change when a new +version directory is added. Each version directory within an OCFL Object MUST +conform to either the same or a later OCFL specification version as the preceding version directory. If inventories are +stored in the version directories then the OCFL specification version for a given version directory is apparent from the +`type` attribute in that [inventory](#inventory-structure). + +### 3.8 Logs Directory +{: #logs-directory} + +The base directory of an OCFL Object MAY contain a directory named `logs`, which MAY be empty. Implementers SHOULD use the [logs +directory](#dfn-logs-directory) for storing files that contain a record of actions taken on the object. 
Since these logs +may be subject to local standards requirements, the format of these logs is considered out-of-scope for the OCFL Object. +Clients operating on the object MAY log actions here that are not otherwise captured. + +> Non-normative note: The purpose of the logs directory is to provide implementers with a location for storing local +information about actions to the OCFL Object's content that is not part of the content itself. +> +> As an example, implementers may have different local requirements to store audit information for their content. Some +may wish to store a log entry indicating that an audit was conducted, and nothing was wrong, while others may wish to +only store a log entry if an intervention was required. + +### 3.9 Object Extensions +{: #object-extensions} + +The base directory of an OCFL Object MAY contain a directory named `extensions` for the +purposes of extending the functionality of an OCFL Object. The `extensions` directory MUST NOT contain any files or sub-directories other than extension sub-directories. +Extension sub-directories SHOULD be named according to a [registered extension +name](#dfn-registered-extension-name) in the [OCFL Extensions repository](https://ocfl.github.io/extensions/). + +> Non-normative note: Extension sub-directories should use the same name as a registered extension in order to both +avoid the possibility of an extension sub-directory colliding with the name of another registered extension as well as to +facilitate the recognition of extensions by OCFL clients. See also [Documenting Local +Extensions](#documenting-local-extensions). + +## 4. OCFL Storage Root +{: #storage-root} + +An [OCFL Storage Root](#dfn-ocfl-storage-root) is the base directory of an OCFL storage layout. + +### 4.1 Root Structure +{: #root-structure} + +An OCFL Storage Root MUST contain a [Root Conformance +Declaration](#root-conformance-declaration) identifying it as such. 
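A minimal sketch of writing such a declaration is given here, following the rules in the Root Conformance Declaration section: the filename is `0=` followed by `ocfl_` and the specification version, and the file contents are the same value followed by a newline. The function name `write_root_declaration` and its parameters are assumptions for illustration only.

```python
from pathlib import Path


def write_root_declaration(storage_root: Path, spec_version: str = "1.1") -> Path:
    """Create the NAMASTE-style declaration file in a storage root.

    `write_root_declaration` is a hypothetical helper for this sketch.
    """
    dvalue = f"ocfl_{spec_version}"
    declaration = storage_root / f"0={dvalue}"
    # Write bytes so that the trailing newline is always a single '\n'.
    declaration.write_bytes((dvalue + "\n").encode("utf-8"))
    return declaration
```

Calling `write_root_declaration(Path("/path/to/root"))` on an existing directory would create a file named `0=ocfl_1.1`; the path shown is a placeholder.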
+ +An OCFL Storage Root MAY contain other files as direct children. These might include a +human-readable copy of the OCFL specification to make the storage root self-documenting, or files used to [document +local extensions](#documenting-local-extensions). The source file for this specification document is in +Markdown (described in \[[RFC7764](#ref-rfc7764)\]), which is designed to be readable as plain text as well as for +rendering as HTML, and is thus suitable for self-documentation. An OCFL validator MUST ignore any files in the storage root it does not understand. + +An OCFL Storage Root MUST NOT contain directories or sub-directories other than +as a directory hierarchy used to store OCFL Objects or for [storage root extensions](#storage-root-extensions). The +directory hierarchy used to store OCFL Objects MUST NOT contain files that are +not part of an OCFL Object. Empty directories MUST NOT appear under a storage +root. + +An OCFL Storage Root MAY contain a file named `ocfl_layout.json` to describe the +arrangement of directories and OCFL objects under the storage root. If present, `ocfl_layout.json` MUST be a JSON (defined by \[[RFC8259](#ref-rfc8259)\]) document encoded in UTF-8 and include the +following two keys in the root JSON object: + +* `extension` - An extension name that identifies an arrangement of directories and OCFL objects under the storage root, +i.e. how OCFL object identifiers are mapped to directory hierarchies. The value of the `extension` key MUST be the [registered extension name](#dfn-registered-extension-name) for the extension +defining the arrangement under the storage root. + +* `description` - A human readable description of the arrangement of directories and OCFL objects under the storage +root. + +Although implementations may require multiple OCFL Storage Roots (that is, several logical or physical volumes, or +multiple "buckets" in an object store), each OCFL Storage Root MUST be independent. 
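The `ocfl_layout.json` requirements above could be checked along the following lines. This is only a sketch: the function name `check_layout` is an assumption, the extension name in the sample is a made-up placeholder rather than a registered name, and the check covers key presence only, not whether the extension is registered.

```python
import json


def check_layout(text: str) -> list:
    """Return a list of problems found in the text of an ocfl_layout.json file."""
    try:
        layout = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(layout, dict):
        return ["root of ocfl_layout.json must be a JSON object"]
    # Both keys must be present in the root JSON object.
    return [f"missing key: {key}"
            for key in ("extension", "description")
            if key not in layout]


# Placeholder values for illustration; not a registered extension name.
sample = '{"extension": "example-layout", "description": "Example arrangement"}'
print(check_layout(sample))
```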
+ +The following example OCFL Storage Root represents the minimal set of files and folders: + +``` +[storage_root] + ├── 0=ocfl_1.1 + ├── ocfl_1.1.md (human-readable text of the OCFL specification; optional) + └── ocfl_layout.json (description of storage hierarchy layout; optional) +``` + +### 4.2 Root Conformance Declaration +{: #root-conformance-declaration} + +The OCFL version declaration MUST be formatted according to the +\[[NAMASTE](#ref-namaste)\] specification. There MUST be exactly one version +declaration file in the base directory of the [OCFL Storage Root](#dfn-ocfl-storage-root) giving the OCFL version in the +filename. The filename MUST conform to the pattern `T=dvalue`, where `T` MUST be 0, and `dvalue` MUST be `ocfl_`, +followed by the OCFL specification version number. The text contents of the file MUST be the same as `dvalue`, followed by a newline (`\n`). + +Root conformance indicates that the OCFL Storage Root conforms to this section (i.e. the OCFL Storage Root section) of +the specification. OCFL Objects within the OCFL Storage Root also include a conformance declaration which MUST indicate OCFL Object conformance to the same or earlier version of the +specification. + +### 4.3 Storage Hierarchies +{: #root-hierarchies} + +[OCFL Object Root](#dfn-ocfl-object-root)s MUST be stored either as the terminal +resource at the end of a directory storage hierarchy or as direct children of a containing [OCFL Storage +Root](#dfn-ocfl-storage-root). + +A common practice is to use a unique identifier scheme to compose this storage hierarchy, typically arranged according +to some form of the \[[PairTree](#ref-pairtree)\] specification. Irrespective of the pattern chosen for the storage +hierarchies, the following restrictions apply: + +1. There MUST be a deterministic mapping from an object identifier to a unique +storage path + +2. Storage hierarchies MUST NOT include files within intermediate directories + +3. 
Storage hierarchies MUST be terminated by OCFL Object Roots + +4. Storage hierarchies within the same OCFL Storage Root SHOULD use just one +layout pattern + +5. Storage hierarchies within the same OCFL Storage Root SHOULD consistently use +either a directory hierarchy of OCFL Objects or top-level OCFL Objects + +### 4.4 Storage Root Extensions +{: #storage-root-extensions} + +The behavior of the storage root may be extended to support features from other specifications. + +The base directory of an OCFL Storage Root MAY contain a directory named `extensions` for +the purposes of extending the functionality of an OCFL Storage Root. The guidelines and limitations for the storage +root `extensions` directory are defined in alignment with those of the [object extensions](#object-extensions). + +The `extensions` directory MUST NOT contain any files or sub-directories +other than extension sub-directories. Extension sub-directories SHOULD be named +according to a registered extension name. + +> Non-normative notes: Extension sub-directories should use the same name as a registered extension in order to both +avoid the possibility of an extension sub-directory colliding with the name of another registered extension as well as to +facilitate the recognition of extensions by OCFL clients. See also [Documenting Local +Extensions](#documenting-local-extensions). +> +> Storage extensions can be used to support additional features, such as providing the storage +hierarchy disposition when pairtree is in use, or additional human-readable text about the nature of the storage root. + +### 4.5 Documenting Local Extensions +{: #documenting-local-extensions} + +It is preferable that both [Object Extensions](#object-extensions) and [Storage Root +Extensions](#storage-root-extensions) are documented and registered in the [OCFL Extensions +repository](https://ocfl.github.io/extensions/). 
However, local extensions MAY be +documented by including a plain text document directly in the storage root, thus making the storage root +self-documenting. + +### 4.6 Filesystem features +{: #filesystem-features} + +In order to maximize the compatibility of the OCFL with different filesystems, and thus improve the portability of OCFL +Objects between different systems, some restrictions on the use of certain filesystem features are necessary. If the +preservation of non-OCFL-compliant features is required then the content MUST be +wrapped in a suitable disk or filesystem image format which OCFL can treat as a regular file. + +1. Filesystem metadata (e.g. permissions, access, and creation times) are not considered portable between filesystems or +preservable through file transfer operations. These attributes also cannot be validated in terms of fixity in a +consistent manner. As such, the OCFL does not support the portability of these attributes. + +2. Hard and soft (symbolic) links are not portable and MUST NOT be used within +OCFL Storage hierarchies. A common use case for links is storage deduplication. OCFL inventories provide a portable +method of achieving the same effect by using digests to address content. + +3. File paths and filenames in the OCFL are case sensitive. Filesystems MUST +preserve the case of OCFL filepaths and filenames. + +4. Transparent filesystem features such as compression and encryption should be effectively invisible to OCFL +operations. Consequently, they should not be expected to be portable. + +## 5. 
Examples +{: #examples} + +_This section is non-normative._ + +### 5.1 Minimal OCFL Object +{: #example-minimal-object} + +The following example OCFL Object has content that is a single file (`file.txt`), and just one version (`v1`): + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + └── v1 + ├── inventory.json + ├── inventory.json.sha512 + └── content + └── file.txt +``` + +The inventory for this OCFL Object, the same both at the top-level and in the `v1` directory, might be: + +```json +{ + "digestAlgorithm": "sha512", + "head": "v1", + "id": "http://example.org/minimal", + "manifest": { + "7545b8...f67": [ "v1/content/file.txt" ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2018-10-02T12:00:00Z", + "message": "One file", + "state": { + "7545b8...f67": [ "file.txt" ] + }, + "user": { + "address": "mailto:alice@example.org", + "name": "Alice" + } + } + } +} +``` + +### 5.2 Versioned OCFL Object +{: #example-versioned-object} + +The following example OCFL Object has three versions: + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + ├── v1 + │   ├── inventory.json + │   ├── inventory.json.sha512 + │   └── content + │ ├── empty.txt + │ ├── foo + │ │   └── bar.xml + │ └── image.tiff + ├── v2 + │   ├── inventory.json + │   ├── inventory.json.sha512 + │   └── content + │ └── foo + │     └── bar.xml + └── v3 + ├── inventory.json + └── inventory.json.sha512 +``` + +In `v1` there are three files, `empty.txt`, `foo/bar.xml`, and `image.tiff`. In `v2` the content of `foo/bar.xml` is +changed, `empty2.txt` is added with the same content as `empty.txt`, and `image.tiff` is removed. In `v3` the file +`empty.txt` is removed, and `image.tiff` is reinstated. As a result of forward-delta versioning, the object tree above +shows only new content added in each version. 
The inventory shown below details the other changes, includes additional +fixity information using `md5` and `sha1` digest algorithms, and minimal metadata for each version. + +```json +{ + "digestAlgorithm": "sha512", + "fixity": { + "md5": { + "184f84e28cbe75e050e9c25ea7f2e939": [ "v1/content/foo/bar.xml" ], + "2673a7b11a70bc7ff960ad8127b4adeb": [ "v2/content/foo/bar.xml" ], + "c289c8ccd4bab6e385f5afdd89b5bda2": [ "v1/content/image.tiff" ], + "d41d8cd98f00b204e9800998ecf8427e": [ "v1/content/empty.txt" ] + }, + "sha1": { + "66709b068a2faead97113559db78ccd44712cbf2": [ "v1/content/foo/bar.xml" ], + "a6357c99ecc5752931e133227581e914968f3b9c": [ "v2/content/foo/bar.xml" ], + "b9c7ccc6154974288132b63c15db8d2750716b49": [ "v1/content/image.tiff" ], + "da39a3ee5e6b4b0d3255bfef95601890afd80709": [ "v1/content/empty.txt" ] + } + }, + "head": "v3", + "id": "ark:/12345/bcd987", + "manifest": { + "4d27c8...b53": [ "v2/content/foo/bar.xml" ], + "7dcc35...c31": [ "v1/content/foo/bar.xml" ], + "cf83e1...a3e": [ "v1/content/empty.txt" ], + "ffccf6...62e": [ "v1/content/image.tiff" ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2018-01-01T01:01:01Z", + "message": "Initial import", + "state": { + "7dcc35...c31": [ "foo/bar.xml" ], + "cf83e1...a3e": [ "empty.txt" ], + "ffccf6...62e": [ "image.tiff" ] + }, + "user": { + "address": "mailto:alice@example.com", + "name": "Alice" + } + }, + "v2": { + "created": "2018-02-02T02:02:02Z", + "message": "Fix bar.xml, remove image.tiff, add empty2.txt", + "state": { + "4d27c8...b53": [ "foo/bar.xml" ], + "cf83e1...a3e": [ "empty.txt", "empty2.txt" ] + }, + "user": { + "address": "mailto:bob@example.com", + "name": "Bob" + } + }, + "v3": { + "created": "2018-03-03T03:03:03Z", + "message": "Reinstate image.tiff, delete empty.txt", + "state": { + "4d27c8...b53": [ "foo/bar.xml" ], + "cf83e1...a3e": [ "empty2.txt" ], + "ffccf6...62e": [ "image.tiff" ] + }, + "user": { + "address": 
"mailto:cecilia@example.com", + "name": "Cecilia" + } + } + } +} +``` + +### 5.3 Different Logical and Content Paths in an OCFL Object +{: #example-object-diff-paths} + +The following example OCFL Object inventory shows how content paths may differ from logical paths. The example object +has just one version, `v1`, which has two files with logical paths `a file.wxy` and `another file.xyz` as shown in the +`state` block. The corresponding content paths are `v1/content/3bacb119a98a15c5` and `v1/content/9f2bab8ef869947d` +respectively, as shown in the `manifest`. Except for location within the appropriate version directory, `v1/content` in +this example, the OCFL specification does not constrain the choice of content paths used when creating or updating an +OCFL object. The choice might depend on particular limitations of, or optimizations for, the target storage system, or +on portability considerations. Any compliant implementation will be able to recover version state with the original +logical paths. + +```json +{ + "digestAlgorithm": "sha512", + "head": "v1", + "id": "http://example.org/diff-paths", + "manifest": { + "7545b8...f67": [ "v1/content/3bacb119a98a15c5" ], + "af318d...3cd": [ "v1/content/9f2bab8ef869947d" ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2019-03-14T20:31:00Z", + "state": { + "7545b8...f67": [ "a file.wxy" ], + "af318d...3cd": [ "another file.xyz" ] + }, + "user": { + "address": "mailto:admin@example.org", + "name": "Some Admin" + } + } + } +} +``` + +### 5.4 BagIt in an OCFL Object +{: #example-bagit-in-ocfl} + +\[[BagIt](#ref-bagit)\] is a common file packaging specification, but unlike the OCFL it does not provide a mechanism +for content versioning. Using the OCFL it is possible to store a BagIt structure with content versioning, such that when +the [logical state](#dfn-logical-state) is resolved, it creates a valid BagIt 'bag'. 
This example will illustrate one +way this can be accomplished, using the [example of a basic +bag](https://datatracker.ietf.org/doc/html/rfc8493#section-4.1) given in the BagIt specification. + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + └── v1 + ├── inventory.json + ├── inventory.json.sha512 + └── content + └── myfirstbag + ├── bagit.txt + ├── data + │   └── 27613-h + │   └── images + │   ├── q172.png + │   └── q172.txt + └── manifest-md5.txt +``` + +If, for example, a new directory were added in a subsequent version, the OCFL Object would look like this: + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + ├── v1 + │ ├── inventory.json + │ ├── inventory.json.sha512 + │ └── content + │   └── myfirstbag + │   ├── bagit.txt + │   ├── data + │   │   └── 27613-h + │   │   └── images + │   │   ├── q172.png + │   │   └── q172.txt + │   └── manifest-md5.txt + └── v2 + ├── inventory.json + ├── inventory.json.sha512 + └── content + └── myfirstbag + ├── data + │   └── 27614-h + │   └── images + │   ├── q173.png + │   └── q173.txt + └── manifest-md5.txt +``` + +The state of the object at version 2 would be the following BagIt object: + +``` +myfirstbag + ├── bagit.txt + ├── data + │   ├── 27613-h + │   │   └── images + │   │   ├── q172.png + │   │   └── q172.txt + │   └── 27614-h + │   └── images + │   ├── q173.png + │   └── q173.txt + └── manifest-md5.txt +``` + +The OCFL Inventory for this object would be as follows: + +```json +{ + "digestAlgorithm": "sha512", + "head": "v2", + "id": "urn:uri:example.com/myfirstbag", + "manifest": { + "cf83e1...a3e": [ "v1/content/myfirstbag/bagit.txt" ], + "f15428...83f": [ "v1/content/myfirstbag/manifest-md5.txt" ], + "85f2b0...007": [ "v1/content/myfirstbag/data/27613-h/images/q172.png" ], + "d66d80...8bd": [ "v1/content/myfirstbag/data/27613-h/images/q172.txt" ], + "2b0ff8...620": [ "v2/content/myfirstbag/manifest-md5.txt" ], + 
"921d36...877": [ "v2/content/myfirstbag/data/27614-h/images/q173.png" ], + "b8bdf1...927": [ "v2/content/myfirstbag/data/27614-h/images/q173.txt" ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2018-10-09T11:20:29.209164Z", + "message": "Initial Ingest", + "state": { + "cf83e1...a3e": [ "myfirstbag/bagit.txt" ], + "85f2b0...007": [ "myfirstbag/data/27613-h/images/q172.png" ], + "d66d80...8bd": [ "myfirstbag/data/27613-h/images/q172.txt" ], + "f15428...83f": [ "myfirstbag/manifest-md5.txt" ] + }, + "user": { + "address": "mailto:someone@example.org", + "name": "Some One" + } + }, + "v2": { + "created": "2018-10-31T11:20:29.209164Z", + "message": "Added new images", + "state": { + "cf83e1...a3e": [ "myfirstbag/bagit.txt" ], + "85f2b0...007": [ "myfirstbag/data/27613-h/images/q172.png" ], + "d66d80...8bd": [ "myfirstbag/data/27613-h/images/q172.txt" ], + "2b0ff8...620": [ "myfirstbag/manifest-md5.txt" ], + "921d36...877": [ "myfirstbag/data/27614-h/images/q173.png" ], + "b8bdf1...927": [ "myfirstbag/data/27614-h/images/q173.txt" ] + }, + "user": { + "address": "mailto:somebody-else@example.org", + "name": "Somebody Else" + } + } + } +} +``` + +### 5.5 Moab in an OCFL Object +{: #example-moab-in-ocfl} + +\[[Moab](#ref-moab)\] is an archive information package format developed and used by Stanford University. Many of the +ideas in Moab have been refined by the OCFL, and the OCFL is designed to give institutions currently using Moab an easy +path to adoption. + +Converting content preserved in a Moab object in a way that does not compromise existing Moab access patterns whilst +allowing for the eventual use of OCFL-native workflows requires a Moab to OCFL conversion tool. 
This tool uses the +Moab-versioning gem to extract deltas and digests of the Moab data directory for each Moab version and translate those +into version `state` blocks in an OCFL inventory file, which would be placed in the root directory of the Moab object. +The content of the `data` directory in the Moab version directories (and thus, the bitstreams that Moab is preserving) +is tracked by OCFL, via the `contentDirectory` value. The contents of the Moab `manifests` directories are not tracked, +as the intention is not to encapsulate a Moab object inside an OCFL object, but rather to migrate Moab's preserved +bitstreams into an OCFL object without compromising legacy access patterns. + +During the transitionary period the OCFL inventory file exists only in the root of the Moab object. Once OCFL-native +object creation workflows have been completed, future versions of that object will be fully OCFL compliant - new +versions will no longer have a manifests directory and will contain an OCFL inventory file. At this stage OCFL tools +will be able to access all versions of the content originally preserved by Moab. 
+ +Consider the following sample Moab object: + +``` +[object root] + └── bj102hs9687 + ├── v0001 + │   ├── data + │   │   ├── content + │   │   │   ├── eric-smith-dissertation-augmented.pdf + │   │   │   └── eric-smith-dissertation.pdf + │   │   └── metadata + │   │   ├── contentMetadata.xml + │   │   ├── descMetadata.xml + │   │   ├── identityMetadata.xml + │   │   ├── provenanceMetadata.xml + │   │   ├── relationshipMetadata.xml + │   │   ├── rightsMetadata.xml + │   │   ├── technicalMetadata.xml + │   │   └── versionMetadata.xml + │   └── manifests + │   ├── fileInventoryDifference.xml + │   ├── manifestInventory.xml + │   ├── signatureCatalog.xml + │   ├── versionAdditions.xml + │   └── versionInventory.xml + ├── v0002 + │   ├── data + │   │   └── metadata + │  │   ├── contentMetadata.xml + │  │   ├── embargoMetadata.xml + │  │   ├── events.xml + │  │   ├── identityMetadata.xml + │  │   ├── provenanceMetadata.xml + │  │   ├── relationshipMetadata.xml + │  │   ├── rightsMetadata.xml + │  │   ├── versionMetadata.xml + │  │   └── workflows.xml + │  └── manifests + │  ├── fileInventoryDifference.xml + │  ├── manifestInventory.xml + │  ├── signatureCatalog.xml + │  ├── versionAdditions.xml + │  └── versionInventory.xml + └── v0003 + ├── data + │   └── metadata + │   ├── contentMetadata.xml + │   ├── descMetadata.xml + │   ├── embargoMetadata.xml + │   ├── events.xml + │   ├── identityMetadata.xml + │   ├── provenanceMetadata.xml + │   ├── rightsMetadata.xml + │   ├── technicalMetadata.xml + │   ├── versionMetadata.xml + │   └── workflows.xml + └── manifests + ├── fileInventoryDifference.xml + ├── manifestInventory.xml + ├── signatureCatalog.xml + ├── versionAdditions.xml + └── versionInventory.xml +``` + +An OCFL inventory that tracks the `data` directory would include a manifest comprised as follows. 
Note the absence of +the `manifests` directory, as we are not encapsulating the Moab object in an OCFL object, and the presence of +`contentDirectory` to specify `data` as the preserved content directory: + +```json +{ + "digestAlgorithm": "sha512", + "head": "v3", + "id": "druid:bj102hs9687", + "contentDirectory": "data", + "manifest": { + "98114a...588": [ "v0001/data/content/eric-smith-dissertation-augmented.pdf" ], + "7f3d87...15b": [ "v0001/data/content/eric-smith-dissertation.pdf" ], + "6d19f0...064": [ "v0001/data/metadata/technicalMetadata.xml" ], + "6e4be4...375": [ "v0001/data/metadata/provenanceMetadata.xml" ], + "d8a319...d0f": [ "v0001/data/metadata/descMetadata.xml" ], + "de823a...acc": [ "v0001/data/metadata/rightsMetadata.xml" ], + "080617...40c": [ "v0001/data/metadata/identityMetadata.xml" ], + "e15267...58d": [ "v0001/data/metadata/versionMetadata.xml" ], + "0d9e0b...9a2": [ "v0001/data/metadata/contentMetadata.xml" ], + "dd9289...31d": [ "v0001/data/metadata/relationshipMetadata.xml" ], + "7519c5...63f": [ "v0002/data/metadata/provenanceMetadata.xml" ], + "abda4c...622": [ "v0002/data/metadata/workflows.xml" ], + "76549e...b2b": [ "v0002/data/metadata/rightsMetadata.xml" ], + "bdc4d6...3b6": [ "v0002/data/metadata/events.xml" ], + "7b331c...f9b": [ "v0002/data/metadata/identityMetadata.xml" ], + "80ceac...b9c": [ "v0002/data/metadata/versionMetadata.xml" ], + "4853a2...fbe": [ "v0002/data/metadata/contentMetadata.xml" ], + "1d5090...f5f": [ "v0002/data/metadata/relationshipMetadata.xml" ], + "f209bf...ceb": [ "v0002/data/metadata/embargoMetadata.xml" ], + "dd9125...d4b": [ "v0003/data/metadata/technicalMetadata.xml" ], + "d9e177...477": [ "v0003/data/metadata/provenanceMetadata.xml" ], + "4f5908...4f5": [ "v0003/data/metadata/workflows.xml" ], + "e64db0...500": [ "v0003/data/metadata/descMetadata.xml" ], + "05fa51...818": [ "v0003/data/metadata/rightsMetadata.xml" ], + "d70dd8...5ad": [ "v0003/data/metadata/events.xml" ], + "509a2d...dc6": [ 
"v0003/data/metadata/identityMetadata.xml" ], + "548066...893": [ "v0003/data/metadata/versionMetadata.xml" ], + "93884e...aae": [ "v0003/data/metadata/contentMetadata.xml" ], + "4c5ab4...b02": [ "v0003/data/metadata/embargoMetadata.xml" ] + }, + "type": "https://ocfl.io/1.1/spec/#inventory", + "versions": { + "v1": { + "created": "2019-03-14T20:31:00Z", + "state": { + "98114a...588": [ "content/eric-smith-dissertation-augmented.pdf" ], + "7f3d87...15b": [ "content/eric-smith-dissertation.pdf" ], + "6d19f0...064": [ "metadata/technicalMetadata.xml" ], + "6e4be4...375": [ "metadata/provenanceMetadata.xml" ], + "d8a319...d0f": [ "metadata/descMetadata.xml" ], + "de823a...acc": [ "metadata/rightsMetadata.xml" ], + "080617...40c": [ "metadata/identityMetadata.xml" ], + "e15267...58d": [ "metadata/versionMetadata.xml" ], + "0d9e0b...9a2": [ "metadata/contentMetadata.xml" ], + "dd9289...31d": [ "metadata/relationshipMetadata.xml" ] + } + }, + "v2": { + "created": "2019-03-24T09:22:00Z", + "state": { + "98114a...588": [ "content/eric-smith-dissertation-augmented.pdf" ], + "7f3d87...15b": [ "content/eric-smith-dissertation.pdf" ], + "6d19f0...064": [ "metadata/technicalMetadata.xml" ], + "7519c5...63f": [ "metadata/provenanceMetadata.xml" ], + "d8a319...d0f": [ "metadata/descMetadata.xml" ], + "76549e...b2b": [ "metadata/rightsMetadata.xml" ], + "7b331c...f9b": [ "metadata/identityMetadata.xml" ], + "80ceac...b9c": [ "metadata/versionMetadata.xml" ], + "4853a2...fbe": [ "metadata/contentMetadata.xml" ], + "1d5090...f5f": [ "metadata/relationshipMetadata.xml" ], + "abda4c...622": [ "metadata/workflows.xml" ], + "bdc4d6...3b6": [ "metadata/events.xml" ], + "f209bf...ceb": [ "metadata/embargoMetadata.xml" ] + } + }, + "v3": { + "created": "2019-04-02T11:07:00Z", + "state": { + "98114a...588": [ "content/eric-smith-dissertation-augmented.pdf" ], + "7f3d87...15b": [ "content/eric-smith-dissertation.pdf" ], + "dd9125...d4b": [ "metadata/technicalMetadata.xml" ], + 
"d9e177...477": [ "metadata/provenanceMetadata.xml" ], + "e64db0...500": [ "metadata/descMetadata.xml" ], + "05fa51...818": [ "metadata/rightsMetadata.xml" ], + "509a2d...dc6": [ "metadata/identityMetadata.xml" ], + "548066...893": [ "metadata/versionMetadata.xml" ], + "93884e...aae": [ "metadata/contentMetadata.xml" ], + "1d5090...f5f": [ "metadata/relationshipMetadata.xml" ], + "4f5908...4f5": [ "metadata/workflows.xml" ], + "d70dd8...5ad": [ "metadata/events.xml" ], + "4c5ab4...b02": [ "metadata/embargoMetadata.xml" ] + } + } + } +} +``` + +### 5.6 Example Extended OCFL Storage Root +{: #example-extended-storage-root} + +The following example OCFL Storage Root has an extension containing custom content. The OCFL Storage Root itself remains +valid. + +``` +[storage root] + ├── 0=ocfl_1.1 + ├── extensions + │   └── 0000-example-extension + │   └── file-example.txt + ├── ocfl_1.1.txt + └── ocfl_layout.json +``` + +### 5.7 Example Extended OCFL Object +{: #example-extended-object} + +The following example OCFL Object has an extension containing custom content. The OCFL Object itself remains valid. + +``` +[object root] + ├── 0=ocfl_object_1.1 + ├── inventory.json + ├── inventory.json.sha512 + ├── extensions + │   └── 0000-example-extension + │   └── file1-draft.txt + └── v1 + ├── inventory.json + ├── inventory.json.sha512 + └── content + └── file.txt +``` + +## 6. References +{: #references} + +### 6.1 Normative References +{: #normative-references} + +**\[FIPS-180-4]** FIPS PUB 180-4 Secure Hash Standard. U.S. Department of Commerce/National +Institute of Standards and Technology. URL: + +**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze.9 November 2009. URL: + + +**\[RFC1321]** The MD5 Message-Digest Algorithm. R. Rivest. IETF. April 1992. Informational. +URL: + +**\[RFC2119]** Key words for use in RFCs to Indicate Requirement Levels. S. Bradner. IETF. +March 1997. Best Current Practice. 
URL: + +**\[RFC3339]** Date and Time on the Internet: Timestamps. G. Klyne; C. Newman. IETF. July 2002. +Proposed Standard. URL: + +**\[RFC3986]** Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; +L. Masinter. IETF. January 2005. Internet Standard. URL: + +**\[RFC4648]** The Base16, Base32, and Base64 Data Encodings. S. Josefsson. IETF. October 2006. +Proposed Standard. URL: + +**\[RFC7693]** The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC). M-J. +Saarinen, Ed.; J-P. Aumasson. IETF. November 2015. Informational. URL: + +**\[RFC8259]** The JavaScript Object Notation (JSON) Data Interchange Format. T. Bray, Ed. +IETF. December 2017\. Internet Standard. URL: + +### 6.2 Informative References +{: #informative-references} + +**\[BagIt]** The BagIt File Packaging Format (V1.0). J. Kunze; J. Littman; E. Madden; J. +Scancella; C. Adams. 17 September 2018. URL: + +**\[Digest-Algorithms-Extension]** OCFL Community Extension 0001: Digest +Algorithms. OCFL Editors. URL: + +**\[JSON-Schema]** JSON Schema Validation: A Vocabulary for Structural Validation of JSON. +A. Wright; H. Andrews. 20 September 2018. URL: + +**\[Moab]** The Moab Design for Digital Object Versioning. Richard Anderson. 15 July 2013. URL: + + +**\[OAIS]** Reference Model for an Open Archival Information System (OAIS), Issue 2. June 2012. +URL: + +**\[OCFL-Implementation-Notes]** OCFL Implementation Notes v1.1. URL: + + +**\[PairTree]** Pairtrees for Object Storage. J. Kunze; M. Haye; E. Hetzner; M. Reyes; C. +Snavely. 12 August 2008\. URL: + +**\[RFC6068]** The 'mailto' URI Scheme. M. Duerst; L. Masinter; J. Zawinski. IETF. October 2010. +Proposed Standard. URL: + +**\[RFC7764]** Guidance on Markdown: Design Philosophies, Stability Strategies, and Select +Registrations. S. Leonard. IETF. March 2016. URL: + +**\[RFC8141]** Uniform Resource Names (URNs). P. Saint-Andre; J. Klensin. IETF. April 2017. +Proposed Standard.
URL: diff --git a/1.1.0/spec/validation-codes.md b/1.1.0/spec/validation-codes.md new file mode 100644 index 0000000..661a754 --- /dev/null +++ b/1.1.0/spec/validation-codes.md @@ -0,0 +1,142 @@ +# OCFL Validation Codes v1.1 + +## Requirements + + - Levels of validation: ERROR, WARNING, INFO. The ERROR level corresponds with MUST in the specification, the WARNING level corresponds with SHOULD in the specification, and the INFO level is validator implementation specific. + - OCFL Spec should define error codes to which validators MUST refer; these error codes will refer to sections in the OCFL Spec. + - Validators MUST validate OCFL Objects + - Validators MUST validate OCFL Storage Roots + +## Object Errors (corresponding with MUST in specification) + +| Code | Description | Reference +| --- | --- | --- | +| E001 | 'The OCFL Object Root must not contain files or directories other than those specified in the following sections.' | https://ocfl.io/1.1/spec/#E001 +| E002 | 'The version declaration must be formatted according to the NAMASTE specification.' | https://ocfl.io/1.1/spec/#E002 +| E003 | 'There must be exactly one version declaration file in the base directory of the OCFL Object Root giving the OCFL version in the filename.' | https://ocfl.io/1.1/spec/#E003 +| E004 | 'The [version declaration] filename MUST conform to the pattern T=dvalue, where T must be 0, and dvalue must be ocfl_object_, followed by the OCFL specification version number.' | https://ocfl.io/1.1/spec/#E004 +| E005 | 'The [version declaration] filename must conform to the pattern T=dvalue, where T MUST be 0, and dvalue must be ocfl_object_, followed by the OCFL specification version number.' | https://ocfl.io/1.1/spec/#E005 +| E006 | 'The [version declaration] filename must conform to the pattern T=dvalue, where T must be 0, and dvalue MUST be ocfl_object_, followed by the OCFL specification version number.' 
| https://ocfl.io/1.1/spec/#E006 +| E007 | 'The text contents of the [version declaration] file must be the same as dvalue, followed by a newline (\n).' | https://ocfl.io/1.1/spec/#E007 +| E008 | 'OCFL Object content must be stored as a sequence of one or more versions.' | https://ocfl.io/1.1/spec/#E008 +| E009 | 'The version number sequence MUST start at 1 and must be continuous without missing integers.' | https://ocfl.io/1.1/spec/#E009 +| E010 | 'The version number sequence must start at 1 and MUST be continuous without missing integers.' | https://ocfl.io/1.1/spec/#E010 +| E011 | 'If zero-padded version directory numbers are used then they must start with the prefix v and then a zero.' | https://ocfl.io/1.1/spec/#E011 +| E012 | 'All version directories of an object must use the same naming convention: either a non-padded version directory number, or a zero-padded version directory number of consistent length.' | https://ocfl.io/1.1/spec/#E012 +| E013 | 'Operations that add a new version to an object must follow the version directory naming convention established by earlier versions.' | https://ocfl.io/1.1/spec/#E013 +| E014 | 'In all cases, references to files inside version directories from inventory files must use the actual version directory names.' | https://ocfl.io/1.1/spec/#E014 +| E015 | 'There must be no other files as children of a version directory, other than an inventory file and a inventory digest.' | https://ocfl.io/1.1/spec/#E015 +| E016 | 'Version directories must contain a designated content sub-directory if the version contains files to be preserved, and should not contain this sub-directory otherwise.' | https://ocfl.io/1.1/spec/#E016 +| E017 | 'The contentDirectory value MUST NOT contain the forward slash (/) path separator and must not be either one or two periods (. or ..).' 
| https://ocfl.io/1.1/spec/#E017 +| E018 | 'The contentDirectory value must not contain the forward slash (/) path separator and MUST NOT be either one or two periods (. or ..).' | https://ocfl.io/1.1/spec/#E018 +| E019 | 'If the key contentDirectory is set, it MUST be set in the first version of the object and must not change between versions of the same object.' | https://ocfl.io/1.1/spec/#E019 +| E020 | 'If the key contentDirectory is set, it must be set in the first version of the object and MUST NOT change between versions of the same object.' | https://ocfl.io/1.1/spec/#E020 +| E021 | 'If the key contentDirectory is not present in the inventory file then the name of the designated content sub-directory must be content.' | https://ocfl.io/1.1/spec/#E021 +| E022 | 'OCFL-compliant tools (including any validators) must ignore all directories in the object version directory except for the designated content directory.' | https://ocfl.io/1.1/spec/#E022 +| E023 | 'Every file within a version\'s content directory must be referenced in the manifest section of the inventory.' | https://ocfl.io/1.1/spec/#E023 +| E024 | 'There must not be empty directories within a version\'s content directory.' | https://ocfl.io/1.1/spec/#E024 +| E025 | 'For content-addressing, OCFL Objects must use either sha512 or sha256, and should use sha512.' | https://ocfl.io/1.1/spec/#E025 +| E026 | 'For storage of additional fixity values, or to support legacy content migration, implementers must choose from the following controlled vocabulary of digest algorithms, or from a list of additional algorithms given in the [Digest-Algorithms-Extension].' | https://ocfl.io/1.1/spec/#E026 +| E027 | 'OCFL clients must support all fixity algorithms given in the table below, and may support additional algorithms from the extensions.' | https://ocfl.io/1.1/spec/#E027 +| E028 | 'Optional fixity algorithms that are not supported by a client must be ignored by that client.' 
| https://ocfl.io/1.1/spec/#E028 +| E029 | 'SHA-1 algorithm defined by [FIPS-180-4] and must be encoded using hex (base16) encoding [RFC4648].' | https://ocfl.io/1.1/spec/#E029 +| E030 | 'SHA-256 algorithm defined by [FIPS-180-4] and must be encoded using hex (base16) encoding [RFC4648].' | https://ocfl.io/1.1/spec/#E030 +| E031 | 'SHA-512 algorithm defined by [FIPS-180-4] and must be encoded using hex (base16) encoding [RFC4648].' | https://ocfl.io/1.1/spec/#E031 +| E032 | '[blake2b-512] must be encoded using hex (base16) encoding [RFC4648].' | https://ocfl.io/1.1/spec/#E032 +| E033 | 'An OCFL Object Inventory MUST follow the [JSON] structure described in this section and must be named inventory.json.' | https://ocfl.io/1.1/spec/#E033 +| E034 | 'An OCFL Object Inventory must follow the [JSON] structure described in this section and MUST be named inventory.json.' | https://ocfl.io/1.1/spec/#E034 +| E035 | 'The forward slash (/) path separator must be used in content paths in the manifest and fixity blocks within the inventory.' | https://ocfl.io/1.1/spec/#E035 +| E036 | 'An OCFL Object Inventory must include the following keys: [id, type, digestAlgorithm, head]' | https://ocfl.io/1.1/spec/#E036 +| E037 | '[id] must be unique in the local context, and should be a URI [RFC3986].' | https://ocfl.io/1.1/spec/#E037 +| E038 | 'In the object root inventory [the type value] must be the URI of the inventory section of the specification version matching the object conformance declaration.' | https://ocfl.io/1.1/spec/#E038 +| E039 | '[digestAlgorithm] must be the algorithm used in the manifest and state blocks.' | https://ocfl.io/1.1/spec/#E039 +| E040 | '[head] must be the version directory name with the highest version number.' | https://ocfl.io/1.1/spec/#E040 +| E041 | 'In addition to these keys, there must be two other blocks present, manifest and versions, which are discussed in the next two sections.'
| https://ocfl.io/1.1/spec/#E041 +| E042 | 'Content paths within a manifest block must be relative to the OCFL Object Root.' | https://ocfl.io/1.1/spec/#E042 +| E043 | 'An OCFL Object Inventory must include a block for storing versions.' | https://ocfl.io/1.1/spec/#E043 +| E044 | 'This block MUST have the key of versions within the inventory, and it must be a JSON object.' | https://ocfl.io/1.1/spec/#E044 +| E045 | 'This block must have the key of versions within the inventory, and it MUST be a JSON object.' | https://ocfl.io/1.1/spec/#E045 +| E046 | 'The keys of [the versions object] must correspond to the names of the version directories used.' | https://ocfl.io/1.1/spec/#E046 +| E047 | 'Each value [of the versions object] must be another JSON object that characterizes the version, as described in the 3.5.3.1 Version section.' | https://ocfl.io/1.1/spec/#E047 +| E048 | 'A JSON object to describe one OCFL Version, which must include the following keys: [created, state]' | https://ocfl.io/1.1/spec/#E048 +| E049 | '[the value of the "created" key] must be expressed in the Internet Date/Time Format defined by [RFC3339].' | https://ocfl.io/1.1/spec/#E049 +| E050 | 'The keys of [the "state" JSON object] are digest values, each of which must correspond to an entry in the manifest of the inventory.' | https://ocfl.io/1.1/spec/#E050 +| E051 | 'The logical path [value of a "state" digest key] must be interpreted as a set of one or more path elements joined by a / path separator.' | https://ocfl.io/1.1/spec/#E051 +| E052 | '[logical] Path elements must not be ., .., or empty (//).' | https://ocfl.io/1.1/spec/#E052 +| E053 | 'Additionally, a logical path must not begin or end with a forward slash (/).' | https://ocfl.io/1.1/spec/#E053 +| E054 | 'The value of the user key must contain a user name key, "name" and should contain an address key, "address".' 
| https://ocfl.io/1.1/spec/#E054 +| E055 | 'If present, [the fixity] block must have the key of fixity within the inventory.' | https://ocfl.io/1.1/spec/#E055 +| E056 | 'The fixity block must contain keys corresponding to the controlled vocabulary given in the digest algorithms listed in the Digests section, or in a table given in an Extension.' | https://ocfl.io/1.1/spec/#E056 +| E057 | 'The value of the fixity block for a particular digest algorithm must follow the structure of the manifest block; that is, a key corresponding to the digest value, and an array of content paths that match that digest.' | https://ocfl.io/1.1/spec/#E057 +| E058 | 'Every occurrence of an inventory file must have an accompanying sidecar file stating its digest.' | https://ocfl.io/1.1/spec/#E058 +| E059 | 'This value must match the value given for the digestAlgorithm key in the inventory.' | https://ocfl.io/1.1/spec/#E059 +| E060 | 'The digest sidecar file must contain the digest of the inventory file.' | https://ocfl.io/1.1/spec/#E060 +| E061 | '[The digest sidecar file] must follow the format: DIGEST inventory.json' | https://ocfl.io/1.1/spec/#E061 +| E062 | 'The digest of the inventory must be computed only after all changes to the inventory have been made, and thus writing the digest sidecar file is the last step in the versioning process.' | https://ocfl.io/1.1/spec/#E062 +| E063 | 'Every OCFL Object must have an inventory file within the OCFL Object Root, corresponding to the state of the OCFL Object at the current version.' | https://ocfl.io/1.1/spec/#E063 +| E064 | 'Where an OCFL Object contains inventory.json in version directories, the inventory file in the OCFL Object Root must be the same as the file in the most recent version.' | https://ocfl.io/1.1/spec/#E064 +| E066 | 'Each version block in each prior inventory file must represent the same object state as the corresponding version block in the current inventory file.'
| https://ocfl.io/1.1/spec/#E066 +| E067 | 'The extensions directory must not contain any files or sub-directories other than extension sub-directories.' | https://ocfl.io/1.1/spec/#E067 +| E069 | 'An OCFL Storage Root MUST contain a Root Conformance Declaration identifying it as such.' | https://ocfl.io/1.1/spec/#E069 +| E070 | 'If present, [the ocfl_layout.json document] MUST include the following two keys in the root JSON object: [extension, description]' | https://ocfl.io/1.1/spec/#E070 +| E071 | 'The value of the [ocfl_layout.json] extension key must be the registered extension name for the extension defining the arrangement under the storage root.' | https://ocfl.io/1.1/spec/#E071 +| E072 | 'The directory hierarchy used to store OCFL Objects MUST NOT contain files that are not part of an OCFL Object.' | https://ocfl.io/1.1/spec/#E072 +| E073 | 'Empty directories MUST NOT appear under a storage root.' | https://ocfl.io/1.1/spec/#E073 +| E074 | 'Although implementations may require multiple OCFL Storage Roots - that is, several logical or physical volumes, or multiple "buckets" in an object store - each OCFL Storage Root MUST be independent.' | https://ocfl.io/1.1/spec/#E074 +| E075 | 'The OCFL version declaration MUST be formatted according to the NAMASTE specification.' | https://ocfl.io/1.1/spec/#E075 +| E076 | 'There must be exactly one version declaration file in the base directory of the OCFL Storage Root giving the OCFL version in the filename.' | https://ocfl.io/1.1/spec/#E076 +| E077 | '[The OCFL version declaration filename] MUST conform to the pattern T=dvalue, where T must be 0, and dvalue must be ocfl_, followed by the OCFL specification version number.' | https://ocfl.io/1.1/spec/#E077 +| E078 | '[The OCFL version declaration filename] must conform to the pattern T=dvalue, where T MUST be 0, and dvalue must be ocfl_, followed by the OCFL specification version number.' 
| https://ocfl.io/1.1/spec/#E078 +| E079 | '[The OCFL version declaration filename] must conform to the pattern T=dvalue, where T must be 0, and dvalue MUST be ocfl_, followed by the OCFL specification version number.' | https://ocfl.io/1.1/spec/#E079 +| E080 | 'The text contents of [the OCFL version declaration file] MUST be the same as dvalue, followed by a newline (\n).' | https://ocfl.io/1.1/spec/#E080 +| E081 | 'OCFL Objects within the OCFL Storage Root also include a conformance declaration which MUST indicate OCFL Object conformance to the same or earlier version of the specification.' | https://ocfl.io/1.1/spec/#E081 +| E082 | 'OCFL Object Roots MUST be stored either as the terminal resource at the end of a directory storage hierarchy or as direct children of a containing OCFL Storage Root.' | https://ocfl.io/1.1/spec/#E082 +| E083 | 'There MUST be a deterministic mapping from an object identifier to a unique storage path.' | https://ocfl.io/1.1/spec/#E083 +| E084 | 'Storage hierarchies MUST NOT include files within intermediate directories.' | https://ocfl.io/1.1/spec/#E084 +| E085 | 'Storage hierarchies MUST be terminated by OCFL Object Roots.' | https://ocfl.io/1.1/spec/#E085 +| E087 | 'An OCFL validator MUST ignore any files in the storage root it does not understand.' | https://ocfl.io/1.1/spec/#E087 +| E088 | 'An OCFL Storage Root MUST NOT contain directories or sub-directories other than as a directory hierarchy used to store OCFL Objects or for storage root extensions.' | https://ocfl.io/1.1/spec/#E088 +| E089 | 'If the preservation of non-OCFL-compliant features is required then the content MUST be wrapped in a suitable disk or filesystem image format which OCFL can treat as a regular file.' | https://ocfl.io/1.1/spec/#E089 +| E090 | 'Hard and soft (symbolic) links are not portable and MUST NOT be used within OCFL Storage hierarchies.' | https://ocfl.io/1.1/spec/#E090 +| E091 | 'Filesystems MUST preserve the case of OCFL filepaths and filenames.' 
| https://ocfl.io/1.1/spec/#E091 +| E092 | 'The value for each key in the manifest must be an array containing the content paths of files in the OCFL Object that have content with the given digest.' | https://ocfl.io/1.1/spec/#E092 +| E093 | 'Where included in the fixity block, the digest values given must match the digests of the files at the corresponding content paths.' | https://ocfl.io/1.1/spec/#E093 +| E094 | 'The value of [the message] key is freeform text, used to record the rationale for creating this version. It must be a JSON string.' | https://ocfl.io/1.1/spec/#E094 +| E095 | 'Within a version, logical paths must be unique and non-conflicting, so the logical path for a file cannot appear as the initial part of another logical path.' | https://ocfl.io/1.1/spec/#E095 +| E096 | 'As JSON keys are case sensitive, while digests may not be, there is an additional requirement that each digest value must occur only once in the manifest regardless of case.' | https://ocfl.io/1.1/spec/#E096 +| E097 | 'As JSON keys are case sensitive, while digests may not be, there is an additional requirement that each digest value must occur only once in the fixity block for any digest algorithm, regardless of case.' | https://ocfl.io/1.1/spec/#E097 +| E098 | 'The content path must be interpreted as a set of one or more path elements joined by a / path separator.' | https://ocfl.io/1.1/spec/#E098 +| E099 | '[content] path elements must not be ., .., or empty (//).' | https://ocfl.io/1.1/spec/#E099 +| E100 | 'A content path must not begin or end with a forward slash (/).' | https://ocfl.io/1.1/spec/#E100 +| E101 | 'Within an inventory, content paths must be unique and non-conflicting, so the content path for a file cannot appear as the initial part of another content path.' | https://ocfl.io/1.1/spec/#E101 +| E102 | 'An inventory file must not contain keys that are not specified.' 
| https://ocfl.io/1.1/spec/#E102 +| E103 | 'Each version directory within an OCFL Object MUST conform to either the same or a later OCFL specification version as the preceding version directory.' | https://ocfl.io/1.1/spec/#E103 +| E104 | 'Version directory names MUST be constructed by prepending v to the version number.' | https://ocfl.io/1.1/spec/#E104 +| E105 | 'The version number MUST be taken from the sequence of positive, base-ten integers: 1, 2, 3, etc.' | https://ocfl.io/1.1/spec/#E105 +| E106 | 'The value of the manifest key MUST be a JSON object.' | https://ocfl.io/1.1/spec/#E106 +| E107 | 'The value of the manifest key must be a JSON object, and each key MUST correspond to a digest value key found in one or more state blocks of the current and/or previous version blocks of the OCFL Object.' | https://ocfl.io/1.1/spec/#E107 +| E108 | 'The contentDirectory value MUST represent a direct child directory of the version directory in which it is found.' | https://ocfl.io/1.1/spec/#E108 +| E110 | 'A unique identifier for the OCFL Object MUST NOT change between versions of the same object.' | https://ocfl.io/1.1/spec/#E110 +| E111 | 'If present, [the value of the fixity key] MUST be a JSON object, which may be empty.' | https://ocfl.io/1.1/spec/#E111 +| E112 | 'The extensions directory must not contain any files or sub-directories other than extension sub-directories.' | https://ocfl.io/1.1/spec/#E112 + + +## Warnings (corresponding with SHOULD in specification) + +| Code | Reason | Spec Reference | +| --- | --- | --- | +| W001 | 'Implementations SHOULD use version directory names constructed without zero-padding the version number, ie. v1, v2, v3, etc.' | https://ocfl.io/1.1/spec/#W001 +| W002 | 'The version directory SHOULD NOT contain any directories other than the designated content sub-directory. Once created, the contents of a version directory are expected to be immutable.'
| https://ocfl.io/1.1/spec/#W002 +| W003 | 'Version directories must contain a designated content sub-directory if the version contains files to be preserved, and SHOULD NOT contain this sub-directory otherwise.'| https://ocfl.io/1.1/spec/#W003 +| W004 | 'For content-addressing, OCFL Objects SHOULD use sha512.' | https://ocfl.io/1.1/spec/#W004 +| W005 | 'The OCFL Object Inventory id SHOULD be a URI.' | https://ocfl.io/1.1/spec/#W005 +| W007 | 'In the OCFL Object Inventory, the JSON object describing an OCFL Version, SHOULD include the message and user keys.' | https://ocfl.io/1.1/spec/#W007 +| W008 | 'In the OCFL Object Inventory, in the version block, the value of the user key SHOULD contain an address key, address.' | https://ocfl.io/1.1/spec/#W008 +| W009 | 'In the OCFL Object Inventory, in the version block, the address value SHOULD be a URI: either a mailto URI [RFC6068] with the e-mail address of the user or a URL to a personal identifier, e.g., an ORCID iD.' | https://ocfl.io/1.1/spec/#W009 +| W010 | 'In addition to the inventory in the OCFL Object Root, every version directory SHOULD include an inventory file that is an Inventory of all content for versions up to and including that particular version.' | https://ocfl.io/1.1/spec/#W010 +| W011 | 'In the case that prior version directories include an inventory file, the values of the created, message and user keys in each version block in each prior inventory file SHOULD have the same values as the corresponding keys in the corresponding version block in the current inventory file.' | https://ocfl.io/1.1/spec/#W011 +| W012 | 'Implementers SHOULD use the logs directory, if present, for storing files that contain a record of actions taken on the object.' | https://ocfl.io/1.1/spec/#W012 +| W013 | 'In an OCFL Object, extension sub-directories SHOULD be named according to a registered extension name.' 
| https://ocfl.io/1.1/spec/#W013 +| W014 | 'Storage hierarchies within the same OCFL Storage Root SHOULD use just one layout pattern.' | https://ocfl.io/1.1/spec/#W014 +| W015 | 'Storage hierarchies within the same OCFL Storage Root SHOULD consistently use either a directory hierarchy of OCFL Objects or top-level OCFL Objects.' | https://ocfl.io/1.1/spec/#W015 +| W016 | 'In the Storage Root, extension sub-directories SHOULD be named according to a registered extension name.' | https://ocfl.io/1.1/spec/#W016 diff --git a/1.1.1/implementation-notes/index.html b/1.1.1/implementation-notes/index.html new file mode 100644 index 0000000..c36c7f5 --- /dev/null +++ b/1.1.1/implementation-notes/index.html @@ -0,0 +1,5 @@ + + +Redirecting to https://ocfl.io/1.1/implementation-notes/ + + diff --git a/1.1.1/spec/index.html b/1.1.1/spec/index.html new file mode 100644 index 0000000..d3c7ace --- /dev/null +++ b/1.1.1/spec/index.html @@ -0,0 +1,5 @@ + + +Redirecting to https://ocfl.io/1.1/spec/ + + diff --git a/1.1.1/spec/validation-codes.html b/1.1.1/spec/validation-codes.html new file mode 100644 index 0000000..ef20b77 --- /dev/null +++ b/1.1.1/spec/validation-codes.html @@ -0,0 +1,5 @@ + + +Redirecting to https://ocfl.io/1.1/spec/validation-codes.html + + diff --git a/1.1/implementation-notes/index.md b/1.1/implementation-notes/index.md index b1ef63d..c424b96 100644 --- a/1.1/implementation-notes/index.md +++ b/1.1/implementation-notes/index.md @@ -1,21 +1,23 @@ --- no_site_title: true --- -OCFL Hand-drive logo +OCFL Hand-drive logo # Implementation Notes, Oxford Common File Layout Specification {:.no_toc} -7 October 2022 +**Unreleased Latest Editors' Draft** **This Version:** -* +* **Latest Published Version:** * **Editors:** -* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), [Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) +* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), \ +[Bodleian Libraries, University of 
Oxford](http://www.bodleian.ox.ac.uk/) * [Rosalyn Metz](https://orcid.org/0000-0003-3526-2230), [Emory University](https://web.library.emory.edu/) * [Julian Morley](https://orcid.org/0000-0003-4176-1933), [Stanford University](https://library.stanford.edu/) * [Simeon Warner](https://orcid.org/0000-0002-7970-7855), [Cornell University](https://www.library.cornell.edu/) @@ -23,12 +25,12 @@ no_site_title: true **Additional Documents:** -* [Specification](https://ocfl.io/1.1/spec/) -* [Validation Codes](https://ocfl.io/1.1/spec/validation-codes.html) +* [Specification](https://ocfl.io/draft/spec/) +* [Validation Codes](https://ocfl.io/draft/spec/validation-codes.html) * [Extensions](https://github.com/OCFL/extensions/) **Previous version:** -* +* **Repository:** * [Github](https://github.com/ocfl/spec) @@ -79,18 +81,23 @@ other digital preservation processes that might be implemented without requiring ### 1.2 Fixity {: #fixity} -The digests in the manifest are used by the OCFL for content addressability rather than fixity but they are suitable for -use as part of a fixity regime, and the manifest block usefully identifies all the files in an object. OCFL validation -also requires that digests and files match. However, while the characteristics of digest algorithms that make them -suitable for fixity checking and content addressing are closely related, they are not identical. In particular, fixity -against malicious tampering requires that a digest computation is hard to reverse, which is not a requirement for -content addressing. It is this aspect which is the most frequent target for cryptoanalytic attack. +The digests in the [manifest block](../spec/#manifest) are used by the OCFL for content addressability rather than +fixity, but they are also suitable for use as part of a fixity regime. The manifest block usefully identifies all +the files in an object and OCFL validation requires that digests and files match. 
+ +While the characteristics of digest algorithms that make them suitable for fixity checking and content addressing are +closely related, they are not identical. In particular, fixity against malicious tampering requires that a digest +computation is hard to reverse, which is not a requirement for content addressing. It is this aspect which is the most +frequent target for cryptoanalytic attack. Consequently, it is sensible to allow additional or alternative fixity algorithms to be used. These may be made in a [fixity block](../spec/#fixity) which has the same layout as a manifest block but permits a broader range of algorithms. -The OCFL will consider a fixity block valid if all the files referenced in the block exist but the OCFL does not -validate digests for all possible algorithms. The fixity block does not have to include all the files in an object to -permit legacy fixity to be imported without requiring continued use of obsolete digest algorithms. +OCFL validation will consider a fixity block valid if all the files referenced in the block exist but the OCFL tools +may not validate digests for all possible algorithms. The fixity block does not have to include all the files in an +object in order to permit legacy fixity to be imported without requiring continued use of obsolete digest algorithms. +Note that digest algorithms can generate identical digests for different file content and this is more likely for +simpler and older digests. Implementations must thus expect and handle cases where the fixity block correctly lists +the same digest for different files. ## 2. Storage {: #storage} @@ -671,13 +678,20 @@ checksums - considering some sort of failure has just occurred. ### 4.1 Informative References {: #informative-references} -**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze.9 November 2009. URL: - +**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze. 9 November 2009. 
URL: +, local copy: **\[OAIS]** Reference Model for an Open Archival Information System (OAIS), Issue 2. June 2012. URL: -**\[OCFL-Specification]** OCFL Specification v1.1. URL: +**\[OCFL-Specification]** OCFL Specification. URL: **\[PairTree]** Pairtrees for Object Storage. J. Kunze; M. Haye; E. Hetzner; M. Reyes; C. -Snavely. 12 August 2008\. URL: +Snavely. 12 August 2008\. URL: + +## Revision history + +| Version | Date | Description | +| --- | --- | --- | +| [v1.1.0](https://ocfl.io/1.1.0/implementation_notes/) | 7 October 2022 | First v1.1, [release notes](https://ocfl.io/news/#version-11-of-the-oxford-common-file-layout-ocfl-released), [change log](https://ocfl.io/1.1.0/spec/change-log.html) | +| [v1.1.1](https://ocfl.io/1.1/implementation_notes/) | 7 November 2024 | Clarified [Fixity](#fixity) section, updated links to NAMASTE and Pairtree specifications [release notes](https://ocfl.io/news/#version-111-of-the-oxford-common-file-layout-ocfl-released), [change log](https://ocfl.io/1.1/spec/change-log.html) | diff --git a/1.1/spec/change-log.md b/1.1/spec/change-log.md index d4d159f..88d3026 100644 --- a/1.1/spec/change-log.md +++ b/1.1/spec/change-log.md @@ -5,7 +5,7 @@ no_site_title: true # Oxford Common File Layout Specification v1.1 Change Log {:.no_toc} -7 October 2022 +7 October 2022 for v1.1.0, updated 7 November 2024 for v1.1.1 **Editors:** @@ -21,17 +21,25 @@ License](https://creativecommons.org/licenses/by/4.0/). [OCFL logo: [Patrick Hochstenbach](http://orcid.org/0000-0001-8390-6171) is licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/). -## Changes from OCFL v1.0 to v1.1 +## Contents -[Version 1.1 of the OCFL Specification](https://ocfl.io/1.1/spec/) is a [minor version](https://semver.org/) update to the [OCFL Specification v1.0](https://ocfl.io/1.0/spec/). The focus is correction and clarification, plus the addition of backwards compatible rules for the specification conformance of prior object versions. 
+This change log combines logs of changes from [Version 1.0 of the OCFL Specification](https://ocfl.io/1.0/spec/) through +[Version 1.1.1 of the OCFL Specification](https://ocfl.io/1.1/spec/): -### Additions in v1.1 + * [Changes from OCFL v1.0 to v1.1.0](#changes-from-ocfl-v10-to-v110) + * [Changes from OCFL v1.1.0 to v1.1.1](#changes-from-ocfl-v110-to-v111) + +## Changes from OCFL v1.0 to v1.1.0 + +[Version 1.1.0 of the OCFL Specification](https://ocfl.io/1.1.0/spec/) is a [minor version](https://semver.org/) update to the [OCFL Specification v1.0](https://ocfl.io/1.0/spec/). The focus is correction and clarification, plus the addition of backwards compatible rules for the specification conformance of prior object versions. + +### Additions in v1.1.0 #### Add requirements to specification version number sequence Added [Conformance of prior versions](https://ocfl.io/1.1/spec/#conformance-of-prior-versions) section to clarify that existing version directories in an object are immutable and that the specification version number sequence must be monotonic. Adds error code [E103](https://ocfl.io/1.1/spec/#E103). (Issue [#544](https://github.com/OCFL/spec/issues/544)) -### Clarifications in v1.1 +### Clarifications in v1.1.0 #### One conformance declaration per object and storage root @@ -97,7 +105,7 @@ Update the reference to the Bagit specification from the draft . (Issue [#553](https://github.com/OCFL/spec/issues/553)) +Even for minor releases the validation codes may be updated. We have thus moved the `validation-codes.md` file into each version directory so that it will be versioned along with the specification. The version of this file for the v1.1 specification is rendered as .
(Issue [#553](https://github.com/OCFL/spec/issues/553)) #### Fix E048 description @@ -106,3 +114,31 @@ The E048 error description in `validation-codes.md` is corrected to remove menti #### Fix E070 description The E070 error description in `validation-codes.md` is corrected to refer to `extension` rather than `key` (which was left from an earlier draft). (Issue [#573](https://github.com/OCFL/spec/issues/573)) + +## Changes from OCFL v1.1.0 to v1.1.1 + +[Version 1.1.1 of the OCFL Specification](https://ocfl.io/1.1/spec/) is a [patch version](https://semver.org/) update to the [OCFL Specification v1.1.0](https://ocfl.io/1.1.0/spec/). There are only clarifications. + +### Clarifications in v1.1.1 + +#### Reword filesystem case sensitivity comments + +Version 1.1.0 had an unenforceable MUST regarding filesystem case preservation. The [Filesystem Features](https://ocfl.io/1.1/spec/#filesystem-features) section was changed to instead point out that implementations over filesystems that either do not preserve case or are not case sensitive require great care, including making appropriate choices for file paths and filenames. (Issue [#528](https://github.com/OCFL/spec/issues/528)) + +#### Add range of specification sections covering Object Structure + +The range of specification sections (3.2 through 3.9) specifying the [Object Structure](#object-structure) was added to make that explicit. (Issue [#637](https://github.com/OCFL/spec/issues/637)) + +#### Clarify description of fixity in Implementation Notes + +The [fixity](https://ocfl.io/1.1/implementation-notes/#fixity) section of the Implementation Notes has been updated to point out differences in requirements for digests used for content addressing (the manifest and state blocks) and fixity (the fixity block). The section also notes that fixity algorithms may generate the same value for different file content.
(Issue [#629](https://github.com/OCFL/spec/issues/629)) + +#### Update links to Pairtree and NAMASTE specifications + +Links to both the Pairtree and NAMASTE specifications have been updated in the [Specification references](https://ocfl.io/1.1/spec/#references) and the [Implementation Notes references](https://ocfl.io/1.1/implementation-notes/#references). (Issues [#627](https://github.com/OCFL/spec/issues/627) and [#629](https://github.com/OCFL/spec/issues/629#issuecomment-1623865455)) + +### Corrections to validation codes + +#### Unenforceable code E091 removed + +The E091 code "Filesystems MUST preserve the case of OCFL filepaths and filenames" was unenforceable and was removed as part of rewording the case sensitivity advice. (Issue [#528](https://github.com/OCFL/spec/issues/528)) diff --git a/1.1/spec/index.md b/1.1/spec/index.md index cc31005..92a1f7a 100644 --- a/1.1/spec/index.md +++ b/1.1/spec/index.md @@ -1,11 +1,12 @@ --- no_site_title: true --- -OCFL Hand-drive logo +OCFL Hand-drive logo # Oxford Common File Layout Specification {:.no_toc} -7 October 2022 +7 October 2022, updated 7 November 2024 **This Version:** * @@ -15,7 +16,8 @@ no_site_title: true **Editors:** -* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), [Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) +* [Neil Jefferies](https://orcid.org/0000-0003-3311-3741), \ +[Bodleian Libraries, University of Oxford](http://www.bodleian.ox.ac.uk/) * [Rosalyn Metz](https://orcid.org/0000-0003-3526-2230), [Emory University](https://web.library.emory.edu/) * [Julian Morley](https://orcid.org/0000-0003-4176-1933), [Stanford University](https://library.stanford.edu/) * [Simeon Warner](https://orcid.org/0000-0002-7970-7855), [Cornell University](https://www.library.cornell.edu/) @@ -27,13 +29,13 @@ no_site_title: true **Additional Documents:** -* [Implementation Notes](https://ocfl.io/1.1/implementation-notes/) -* [Specification Change 
Log](https://ocfl.io/1.1/spec/change-log.html) -* [Validation Codes](https://ocfl.io/1.1/spec/validation-codes.html) +* [Implementation Notes](https://ocfl.io/draft/implementation-notes/) +* [Specification Change Log](https://ocfl.io/draft/spec/change-log.html) +* [Validation Codes](https://ocfl.io/draft/spec/validation-codes.html) * [Extensions](https://github.com/OCFL/extensions/) **Previous Version:** -* +* **Repository:** * [Github](https://github.com/ocfl/spec) @@ -61,8 +63,8 @@ management of digital objects within digital repositories. The OCFL initiative began as a discussion amongst digital repository practitioners to identify well-defined, common, and application-independent file management for a digital repository's persisted objects and represents a specification of -the community’s collective recommendations addressing five primary requirements: completeness, parsability, versioning, -robustness, and storage diversity. +the community’s collective recommendations addressing five primary requirements: completeness, parsability, +versioning, robustness, and storage diversity. #### Completeness {:.no_toc #completeness} @@ -236,7 +238,7 @@ object validation. The structure for an object with one version is shown in the ``` The [OCFL Object Root](#dfn-ocfl-object-root) MUST NOT contain files or -directories other than those specified in the following sections. +directories other than those specified in the following sections (3.2 through 3.9). ### 3.2 Object Conformance Declaration {: #object-conformance-declaration} @@ -328,11 +330,24 @@ supported by a client MUST be ignored by | Digest Algorithm Name | Note | | --- | --- | -| `md5` | Insecure. Use only for legacy fixity values. MD5 algorithm and hex encoding defined by \[[RFC1321](#ref-rfc1321)\]. For example, the `md5` digest of a zero-length bitstream is `d41d8cd98f00b204e9800998ecf8427e`. | -| `sha1` | Insecure. Use only for legacy fixity values.
SHA-1 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha1` digest of a zero-length bitstream is `da39a3ee5e6b4b0d3255bfef95601890afd80709`. | -| `sha256` | Non-truncated form only; note performance implications. SHA-256 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha256` digest of a zero-length bitstream starts `e3b0c44298fc1c149afbf4c8996fb92427ae41e4...` (64 hex digits long). | -| `sha512` | Default choice. Non-truncated form only. SHA-512 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `sha512` digest of a zero-length bitstream starts `cf83e1357eefb8bdf1542850d66d8007d620e405...` (128 hex digits long). | -| `blake2b-512` | Full-length form only, using the 2B variant (64 bit) as defined by \[[RFC7693](#ref-rfc7693)\]. MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. For example, the `blake2b-512` digest of a zero-length bitstream starts `786a02f742015903c6c6fd852552d272912f4740...` (128 hex digits long). | +| `md5` | Insecure. Use only for legacy fixity values. MD5 algorithm and hex encoding defined by +\[[RFC1321](#ref-rfc1321)\]. +For example, the `md5` digest of a zero-length bitstream is `d41d8cd98f00b204e9800998ecf8427e`. | +| `sha1` | Insecure. Use only for legacy fixity values. SHA-1 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and +MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. +For example, the `sha1` digest of a zero-length bitstream is `da39a3ee5e6b4b0d3255bfef95601890afd80709`. | +| `sha256` | Non-truncated form only; note performance implications. SHA-256 algorithm defined by +\[[FIPS-180-4](#ref-fips-180-4)\] and MUST be encoded using hex (base16) encoding +\[[RFC4648](#ref-rfc4648)\]. 
For example, the `sha256` digest of a zero-length bitstream starts +`e3b0c44298fc1c149afbf4c8996fb92427ae41e4...` (64 hex digits long). | +| `sha512` | Default choice. Non-truncated form only. SHA-512 algorithm defined by \[[FIPS-180-4](#ref-fips-180-4)\] and +MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. +For example, the `sha512` digest of a zero-length bitstream starts `cf83e1357eefb8bdf1542850d66d8007d620e405...` +(128 hex digits long). | +| `blake2b-512` | Full-length form only, using the 2B variant (64 bit) as defined by \[[RFC7693](#ref-rfc7693)\]. +MUST be encoded using hex (base16) encoding \[[RFC4648](#ref-rfc4648)\]. +For example, the `blake2b-512` digest of a zero-length bitstream starts `786a02f742015903c6c6fd852552d272912f4740...` +(128 hex digits long). | An OCFL Inventory MAY contain a fixity section that can store one or more blocks containing fixity values using multiple digest algorithms. See the [section on fixity](#fixity) below for further details. @@ -666,7 +681,8 @@ defining the arrangement under the storage root. root. Although implementations may require multiple OCFL Storage Roots—that is, several logical or physical volumes, or -multiple "buckets" in an object store—each OCFL Storage Root MUST be independent. +multiple "buckets" in an object store—each OCFL Storage Root MUST be +independent. The following example OCFL Storage Root represents the minimal set of files and folders: @@ -763,8 +779,8 @@ consistent manner. As such, the OCFL does not support the portability of these a OCFL Storage hierarchies. A common use case for links is storage deduplication. OCFL inventories provide a portable method of achieving the same effect by using digests to address content. -3. File paths and filenames in the OCFL are case sensitive. Filesystems MUST -preserve the case of OCFL filepaths and filenames. +3. File paths and filenames in the OCFL are case sensitive. 
Implementations over filesystems that either do not preserve +case or are not case sensitive require great care, including making appropriate choices for file paths and filenames. 4. Transparent filesystem features such as compression and encryption should be effectively invisible to OCFL operations. Consequently, they should not be expected to be portable. @@ -1317,8 +1333,8 @@ The following example OCFL Object has an extension containing custom content. Th **\[FIPS-180-4]** FIPS PUB 180-4 Secure Hash Standard. U.S. Department of Commerce/National Institute of Standards and Technology. URL: -**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze.9 November 2009. URL: - +**\[NAMASTE]** Directory Description with Namaste Tags. J. Kunze. 9 November 2009. URL: +, local copy: **\[RFC1321]** The MD5 Message-Digest Algorithm. R. Rivest. IETF. April 1992. Informational. URL: @@ -1359,11 +1375,11 @@ A. Wright; H Andrews.20 September 2018. URL: **\[OAIS]** Reference Model for an Open Archival Information System (OAIS), Issue 2. June 2012. URL: -**\[OCFL-Implementation-Notes]** OCFL Implementation Notes v1.1. URL: - +**\[OCFL-Implementation-Notes]** OCFL Implementation Notes. URL: + **\[PairTree]** Pairtrees for Object Storage. J. Kunze; M. Haye; E. Hetzner; M. Reyes; C. -Snavely. 12 August 2008\. URL: +Snavely. 12 August 2008\. URL: **\[RFC6068]** The 'mailto' URI Scheme. M. Duerst; L. Masinter; J. Zawinski. IETF. October 2010. Proposed Standard. URL: @@ -1373,3 +1389,10 @@ Registrations. S. Leonard. IETF. March 2016. URL: **\[RFC8141]** Uniform Resource Names (URNs). P. Saint-Andre; J. Klensin. IETF. April 2017. Proposed Standard. 
URL: + +## Revision history + +| Version | Date | Description | +| --- | --- | --- | +| [v1.1.0](https://ocfl.io/1.1.0/spec/) | 7 October 2022 | First v1.1, [release notes](https://ocfl.io/news/#version-11-of-the-oxford-common-file-layout-ocfl-released), [change log](https://ocfl.io/1.1.0/spec/change-log.html) | +| [v1.1.1](https://ocfl.io/1.1.1/spec/) | 7 November 2024 | Clarifications, [release notes](https://ocfl.io/news/#version-111-of-the-oxford-common-file-layout-ocfl-released), [change log](https://ocfl.io/1.1.1/spec/change-log.html) | diff --git a/1.1/spec/validation-codes.md b/1.1/spec/validation-codes.md index 661a754..d28dd1e 100644 --- a/1.1/spec/validation-codes.md +++ b/1.1/spec/validation-codes.md @@ -98,7 +98,6 @@ | E088 | 'An OCFL Storage Root MUST NOT contain directories or sub-directories other than as a directory hierarchy used to store OCFL Objects or for storage root extensions.' | https://ocfl.io/1.1/spec/#E088 | E089 | 'If the preservation of non-OCFL-compliant features is required then the content MUST be wrapped in a suitable disk or filesystem image format which OCFL can treat as a regular file.' | https://ocfl.io/1.1/spec/#E089 | E090 | 'Hard and soft (symbolic) links are not portable and MUST NOT be used within OCFL Storage hierarchies.' | https://ocfl.io/1.1/spec/#E090 -| E091 | 'Filesystems MUST preserve the case of OCFL filepaths and filenames.' | https://ocfl.io/1.1/spec/#E091 | E092 | 'The value for each key in the manifest must be an array containing the content paths of files in the OCFL Object that have content with the given digest.' | https://ocfl.io/1.1/spec/#E092 | E093 | 'Where included in the fixity block, the digest values given must match the digests of the files at the corresponding content paths.' | https://ocfl.io/1.1/spec/#E093 | E094 | 'The value of [the message] key is freeform text, used to record the rationale for creating this version. It must be a JSON string.' 
| https://ocfl.io/1.1/spec/#E094 diff --git a/draft/implementation-notes/index.md b/draft/implementation-notes/index.md index 609675d..36a3553 100644 --- a/draft/implementation-notes/index.md +++ b/draft/implementation-notes/index.md @@ -6,7 +6,7 @@ no_site_title: true # Implementation Notes, Oxford Common File Layout Specification {:.no_toc} -**Unreleased Latest Editors' Draft** - 7 October 2022 +**Unreleased Latest Editors' Draft** **This Version:** * diff --git a/draft/spec/index.md b/draft/spec/index.md index c5acc5f..08e25f8 100644 --- a/draft/spec/index.md +++ b/draft/spec/index.md @@ -6,7 +6,7 @@ no_site_title: true # Oxford Common File Layout Specification {:.no_toc} -**Unreleased Latest Editors' Draft** - 7 October 2022 +**Unreleased Latest Editors' Draft** **This Version:** * diff --git a/index.md b/index.md index c775958..d2f37a6 100644 --- a/index.md +++ b/index.md @@ -10,6 +10,7 @@ Specifically, the benefits of the OCFL include: * __Storage diversity__, to ensure content can be stored on diverse storage infrastructures including conventional filesystems and cloud object stores ## News + * 2024-11-07: [Version 1.1.1 Update Announcement](/news/#version-111-of-the-oxford-common-file-layout-ocfl-released) * 2024-10-24: [OCFL Editors Workshop at iPres](/news/#ocfl-editors-workshop-at-ipres) * 2023-08-01: [Community Listening Sessions for Version 2 Announcement](/news/#community-listening-sessions-for-version-2-of-the-oxford-common-file-layout) * 2022-10-07: [Version 1.1 Release Announcement](/news/#version-11-of-the-oxford-common-file-layout-ocfl-released) @@ -31,8 +32,8 @@ Specifically, the benefits of the OCFL include: ## Citation -Citable copies of the specification, extensions and fixtures can be found on the -[Zenodo OCFL Community site](https://zenodo.org/communities/ocfl/records?q=&f=resource_type%3Asoftware&l=list&p=1&s=10&sort=newest). 
+Citable copies of the specification, extensions and fixtures can be found on the +[Zenodo OCFL Community site](https://zenodo.org/communities/ocfl/records?q=&f=resource_type%3Asoftware&l=list&p=1&s=10&sort=newest). ## Previous Releases diff --git a/news/index.md b/news/index.md index f8b5130..1dc20e0 100644 --- a/news/index.md +++ b/news/index.md @@ -1,5 +1,34 @@ # News +## Version 1.1.1 of the Oxford Common File Layout (OCFL) Released + +**7 November 2024** + +The OCFL Editors are pleased to announce version 1.1.1, a patch update to version 1.1 of the Oxford Common File Layout +specification. This revision provides minor clarifications based on implementation experience and community feedback. OCFL +Storage Root or OCFL Object version designations do not reflect patch version and thus remain v1.1. + +### What new information is available? + +The [OCFL Specification v1.1](https://ocfl.io/1.1/spec/) and +[OCFL Implementation Notes v1.1](https://ocfl.io/1.1/implementation-notes/) have both been updated. The accompanying +[Change Log](https://ocfl.io/1.1/spec/change-log.html) details the changes both from version 1.0 to version 1.1.0 and +also to version 1.1.1. It is designed to assist implementers with updates to their implementations. We welcome +your feedback, questions, use cases, and especially details of any implementations or experimentation with OCFL. + +The previous OCFL version 1.1 documents, now designated version 1.1.0, remain available: +[OCFL Specification v1.1.0](https://ocfl.io/1.1.0/spec/), +[OCFL Implementation Notes v1.1.0](https://ocfl.io/1.1.0/implementation-notes/), and +[Change Log for v1.1.0](https://ocfl.io/1.1.0/spec/change-log.html). 
+ +### The OCFL Editors + +Neil Jefferies (Bodleian Libraries, University of Oxford)\ +Rosalyn Metz (Emory University)\ +Julian Morley (Stanford University)\ +Simeon Warner (Cornell University)\ +Andrew Woods (Harvard University) + ## OCFL Editors Workshop at iPres **24 October 2024** @@ -109,19 +138,19 @@ compatible with version 1.0. The OCFL website at [https://ocfl.io](https://ocfl.io), includes the most up to date version of the specification and the implementation notes as well as the latest editors draft. -The [OCFL Specification v1.1](https://ocfl.io/1.1/spec/) defines both OCFL Objects, a simple structure for content and +The [OCFL Specification v1.1](https://ocfl.io/1.1.0/spec/) defines both OCFL Objects, a simple structure for content and a JSON document (`inventory.json`) which provides a straightforward but comprehensive register for the object and versions of its content, and an OCFL Storage Root, an arrangement for how OCFL Objects are laid out on physical storage. It also contains examples illustrating the use of the OCFL, and explanations that ground decisions in prior experience. -The companion [OCFL Implementation Notes v1.1](https://ocfl.io/1.1/implementation-notes/) contains advice for implementing +The companion [OCFL Implementation Notes v1.1](https://ocfl.io/1.1.0/implementation-notes/) contains advice for implementing the specification including recommendations for digital preservation, storage handling, client behaviors, and best practices for dealing with OCFL Objects in motion. -The OCFL Editors are also releasing updated [validation rules](https://ocfl.io/1.1/spec/validation-codes.html) and +The OCFL Editors are also releasing updated [validation rules](https://ocfl.io/1.1.0/spec/validation-codes.html) and additional [fixture objects](https://github.com/OCFL/fixtures) for testing OCFL implementations against the specification. 
-There is an accompanying [Change Log](https://ocfl.io/1.1/spec/change-log.html) that details the changes from version 1.0 +There is an accompanying [Change Log](https://ocfl.io/1.1.0/spec/change-log.html) that details the changes from version 1.0 to version 1.1. It is designed to assist implementers with updates to their implementations. We welcome your feedback, questions, use cases, and especially details of any implementations or experimentation with OCFL.
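
Reviewer note, not part of the patch: the reworded Fixity section and the reflowed digest table in this release can be illustrated with a minimal Python sketch. It assumes a fixity block that has already been parsed from `inventory.json` into a dict of the form algorithm name, then digest, then list of content paths. The names `OCFL_DIGESTS`, `ocfl_digest` and `check_fixity_block` are hypothetical and do not come from the specification or any OCFL library. The sketch checks only that each listed file exists and, for algorithms it knows, that the digest matches. A digest listed for several files is accepted as-is, since the notes say the fixity block may correctly list the same digest for different files. Comparing hex digests in lowercase is a choice made for this sketch.

```python
import hashlib
from pathlib import Path

# Hypothetical mapping from the digest algorithm names in the spec table
# to Python hashlib constructors (an assumption of this sketch).
OCFL_DIGESTS = {
    "md5": hashlib.md5,
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha512": hashlib.sha512,
    "blake2b-512": lambda: hashlib.blake2b(digest_size=64),
}


def ocfl_digest(algorithm: str, data: bytes) -> str:
    """Return the lowercase hex digest of data for a known algorithm name."""
    hasher = OCFL_DIGESTS[algorithm]()
    hasher.update(data)
    return hasher.hexdigest()


def check_fixity_block(object_root: Path, fixity: dict) -> list:
    """Return a list of problem strings for a parsed fixity block.

    Every referenced file must exist. Digests are recomputed only for
    algorithms present in OCFL_DIGESTS; other algorithms get an existence
    check only. Several paths under one digest are treated as normal.
    """
    problems = []
    for algorithm, digests in fixity.items():
        for expected, content_paths in digests.items():
            for content_path in content_paths:
                file_path = object_root / content_path
                if not file_path.is_file():
                    problems.append(f"missing file: {content_path}")
                    continue
                if algorithm not in OCFL_DIGESTS:
                    continue  # unknown algorithm: existence check only
                actual = ocfl_digest(algorithm, file_path.read_bytes())
                if actual != expected.lower():
                    problems.append(f"{algorithm} mismatch: {content_path}")
    return problems
```

Calling `ocfl_digest("md5", b"")` reproduces the zero-length bitstream value quoted in the digest table, which makes a quick sanity check for the name mapping. Reading whole files with `read_bytes()` keeps the sketch short; a real validator would more likely hash in chunks.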