Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add section about implementing extensions #31

Merged
merged 5 commits into from
Nov 5, 2020

Conversation

pwinckles
Copy link
Contributor

This is the second breakout from PR #29. It covers how extensions should be implemented.

It includes additional, and perhaps controversial, changes from the same section in PR #29. Specifically, it introduces a new extension manifest file, extensions.json that is located in root extension directories. Per the OCFL spec, it is technically not permissible to have this file in the root extension directory. I think this is a mistake, but a workaround could be to stick it in a subdirectory instead.

Why is the manifest needed? Having the manifest explicitly declare what extensions are in use enables extension chaining extensions and gracefully handles obsoleted extensions or extensions that are no longer used.

In the chaining case, suppose that you have an extension that chains together 3 other extensions. The chaining extension and each of the chained extensions each have their own configuration. So, you have a root extension directory that looks something like this:

extensions/
  - 0001-extension-a /
  - 0002-extension-b/
  - 0003-extension-c/
  - 0004-extension-d/

Without the manifest, a client has to inspect the extensions directory and load whatever it finds. The end result in this case is that 4 unique extensions are loaded.

If it had a manifest that looked like:

{
  "extensions": [
    "0004-extension-d"
  ]
}

Then, it would instead only load a single extension, the chaining extension, which would in turn load its dependent extensions. It is true, this same behavior could be achieved without the manifest but it is more convoluted and requires a second pass to "link" the dependent extensions after the initial load.

In the obsoleted/deprecated case, suppose that you're using an extension and decide to replace it with another or stop using it entirely. In this case, you do not necessarily want to get rid of the extension's configuration. For example, let's say that it's an extension that describes a log format that you're no longer using, but you still have historic files in the log directory that use the format. You probably want to keep a record of the extension that created those files and its configuration, and you can do so if you can have extension configuration within the extensions directory that is not actively used because it is not declared as being in use.

Finally, there is an outstanding problem that this PR still does not address, which is how can you used the same extension configured in multiple different ways on the same repository or object? For example, let's say that there's an extension that's used for specifying id formats, and you have a repository that contains two distinct types of ids. It is currently impossible to use the same extension to define both of the formats, assuming that the extension itself does not internally support this, because the extension can only be configured once.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@rosy1280 rosy1280 added draft and removed in-review labels Oct 20, 2020
README.md Outdated Show resolved Hide resolved
@rosy1280
Copy link
Contributor

Editors met, and we really like this idea but want to deep dive and explore in more detail, only so we can be sure that we get it right and aren't continuing to go back and forth. An issue has been created #33 to help us explore this topic further.

In the mean time we'll leave this PR open and label it as a draft so we can use some (or all) of the language later.

bcail
bcail previously approved these changes Oct 20, 2020
@pwinckles
Copy link
Contributor Author

An alternative extensions.json format would be if it contained the extension configuration itself. For example:

{
  "enabled": {
    "0000-example-extension": {
      "extensionName": "0000-example-extension",
      "firstExampleParameter": 12, 
      "secondExampleParameter": "Hello", 
      "thirdExampleParameter": "Green"
    },
    "0002-flat-direct-storage-layout": {
      "extensionName": "0002-flat-direct-storage-layout"
    }
  }
}

Using this scheme, there would not be individual configuration files for each extension. There would only be a single configuration file in the storage root and one per object that uses extensions. The only reason that extension specific directories under extensions would exist is if the extension needs a place to write files, eg the mutable HEAD extension.

I did not suggest this earlier because I was trying to keep the structure as close to what we've discussed previously as possible. As always, there are advantages and disadvantages to both approaches. I think there is one key advantage to having all of the configuration in a single file, and that is that it reduces the number of files that need to be read when accessing the object. At this point, I think it's unlikely that anyone would ever have more than a couple extensions on their objects, when the configuration is separated into different files there is a minimum of one plus the number of extensions used files to read when accessing an object, the cost of which varies depending on your storage (S3...).

Additional notes about this format:

  • The extension configuration objects must all have a required extensionName key that's value is the extension's registered name. This allows JSON libraries to polymorphically deserialize the objects.
  • In the example, the extension map is under the enabled key. This was in case we wanted to preserve the ability to maintain extension configuration that is no longer in use. For example, you could put inactive configuration under disabled.

README.md Outdated Show resolved Hide resolved
zimeon
zimeon previously approved these changes Nov 3, 2020
@zimeon zimeon added in-review and removed draft labels Nov 3, 2020
awoods
awoods previously approved these changes Nov 3, 2020
rosy1280
rosy1280 previously approved these changes Nov 4, 2020
neilsjefferies
neilsjefferies previously approved these changes Nov 4, 2020
@awoods
Copy link
Member

awoods commented Nov 4, 2020

@pwinckles : it looks like this PR needs a rebase.

@pwinckles
Copy link
Contributor Author

@pwinckles : it looks like this PR needs a rebase.

Fixed

@zimeon zimeon self-requested a review November 4, 2020 17:53
@neilsjefferies neilsjefferies merged commit 9ae94a2 into OCFL:master Nov 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants