Support File Renaming in inventory.jsonld #16

Closed
ahankinson opened this issue May 25, 2018 · 10 comments

@ahankinson
Contributor

Currently, the inventory.jsonld file does not support file renaming. We need to resolve this.

Relates to OCFL/Use-Cases#26

@julianmorley
Contributor

OK, how about something like this - syntax simplified, with extraneous values omitted and checksums abbreviated for legibility.

{
  "type": "Object",
  "head": "#v6",

  // For validation & object reconstitution, a scan of all version directories
  // MUST contain at least 1 file that matches every checksum here,
  // but we don't actually care what the filename is - just that the content
  // is present.

  "checksums": [
    "a83e3633",
    "bb123efc",
    "f4abe741",
    "ee983ac4"
  ],

  // Here we use forward diffs to construct the object history through
  // various versions. ADD, COPY, RENAME and DELETE actions are demonstrated.
  // The intention is that the most recent inventory file should be capable
  // of re-constituting the object to any prior version level.

  // It presumes that a scan of all the version directories has taken place,
  // and that at least one file that matches every checksum referenced above has been found.

  "versions": [
    {
      "type": "Version",
      "id": "#v1", // v1 initial add of 3 files
      "a83e3633": ["/file1"],
      "bb123efc": ["/file2"],
      "f4abe741": ["/file3"]
    },

    {
      "type": "Version",
      "id": "#v2", // v2 copy file2 to file4
      "bb123efc": ["/file2", "/file4"]
    },

    {
      "type": "Version",
      "id": "#v3", // v3 rename file1 to file5
      "a83e3633": ["/file5"]
    },

    {
      "type": "Version",
      "id": "#v4", // v4 add file6
      "ee983ac4": ["/file6"]
    },

    {
      "type": "Version",
      "id": "#v5", // v5 delete file3
      "f4abe741": [""]
    },

    {
      "type": "Version",
      "id": "#v6", // v6 delete file4, rename file2 to file7
      "bb123efc": ["/file7"]
    }
  ]
}
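
To make the forward-diff idea concrete, here is a minimal Python sketch of how the "versions" array above might be replayed to rebuild the object's state at any version. It rests on one reading of the example that the thread does not spell out: each checksum entry in a delta replaces that checksum's whole path list, and a list holding only "" means the content has no remaining paths. `DELTAS`, `RESERVED` and `state_at` are hypothetical names, not part of any OCFL tooling.

```python
from typing import Dict, List

# Hypothetical in-memory form of the "versions" array above, comments stripped.
DELTAS: List[dict] = [
    {"type": "Version", "id": "#v1",
     "a83e3633": ["/file1"], "bb123efc": ["/file2"], "f4abe741": ["/file3"]},
    {"type": "Version", "id": "#v2", "bb123efc": ["/file2", "/file4"]},
    {"type": "Version", "id": "#v3", "a83e3633": ["/file5"]},
    {"type": "Version", "id": "#v4", "ee983ac4": ["/file6"]},
    {"type": "Version", "id": "#v5", "f4abe741": [""]},
    {"type": "Version", "id": "#v6", "bb123efc": ["/file7"]},
]

# Keys that describe the delta itself rather than a checksum.
RESERVED = {"type", "id"}


def state_at(deltas: List[dict], version_id: str) -> Dict[str, List[str]]:
    """Replay deltas in order and return checksum -> paths at version_id.

    Assumption: a checksum entry replaces that checksum's full path list,
    and an entry with no non-empty paths removes the checksum from the state.
    """
    state: Dict[str, List[str]] = {}
    for delta in deltas:
        for key, paths in delta.items():
            if key in RESERVED:
                continue
            live = [p for p in paths if p]  # drop the "" delete marker
            if live:
                state[key] = live
            else:
                state.pop(key, None)
        if delta["id"] == version_id:
            return state
    raise KeyError(f"unknown version: {version_id}")
```

Under those assumptions, `state_at(DELTAS, "#v6")` leaves three checksums mapped to /file5, /file7 and /file6, and calling it with "#v2" shows bb123efc at both /file2 and /file4.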

@awoods
Member

awoods commented May 28, 2018

Thanks, @julianmorley.
This seems like a constructive path forward. 👍

@julianmorley
Contributor

I've refined this a touch and created a gist:
https://gist.github.com/julianmorley/9bc5d2ff525fbfc39d80e1fa3e2641a8

The main change is that I've renamed the version objects as deltas, and expressed version as a key/value attribute instead. This is to allow OCFL to support underlying preservation objects (or files) that don't have such a rigid notion of versioning (e.g. BagIt) but that might still be revised over time.

@bcail
Contributor

bcail commented May 31, 2018

@julianmorley in your #v2, where file2 is copied to file4, would this support storing the duplicate content only once on the filesystem? Or would you have to have the duplicate content stored twice under the different names?

@julianmorley
Contributor

@bcail It supports that, yes, but it doesn't enforce it - that depends on the underlying object ontology. Moab, for example, natively de-dupes files across versions. But if the underlying object were BagIt, with the same file in two different version directories, all this would do is note that the exact same file shows up twice in the object, in two different locations.

If the tool used to create a new version of a BagIt object that conforms to OCFL is smart, then it could make the contents of v2 be only the changes made - relying on the contents of the BagIt manifest and the OCFL inventory to correctly re-hydrate an object. But that might be a bit of an edge case.
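
As a rough illustration of the "only the changes made" idea, the Python sketch below picks out which files in a new version carry content that is not already stored. It is not taken from Moab, BagIt or any OCFL tool; `files_to_store` and its arguments are hypothetical, and it assumes the checksums for earlier versions are already known.

```python
from typing import Dict, Iterable, List


def files_to_store(known_checksums: Iterable[str],
                   new_files: Dict[str, str]) -> List[str]:
    """Return the paths whose content must be written for the new version.

    known_checksums: checksums already present in earlier version directories.
    new_files: logical path -> checksum for the new version (hypothetical shape).
    Content already stored, or repeated within the new version, is skipped.
    """
    seen = set(known_checksums)
    to_store: List[str] = []
    for path, checksum in sorted(new_files.items()):
        if checksum not in seen:
            seen.add(checksum)  # store each new checksum only once
            to_store.append(path)
    return to_store
```

For the #v2 copy of file2 to file4 this returns an empty list, since bb123efc is already stored; the inventory alone records the second path.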

@ahankinson
Contributor Author

ahankinson commented Jun 4, 2018

Propose: Add deltas to a Version and follow @julianmorley's proposal. This will help with deduplication and also help track the changes that are made within a version.

Structure of deltas is TBD.

Also propose reversing manifest and members to use paths as keys.

@ahankinson
Contributor Author

@julianmorley will propose some wording to help make the deltas clearer.

@rosy1280
Contributor

rosy1280 commented Jun 4, 2018

@zimeon
Contributor

zimeon commented Sep 7, 2018

F2F decision: arbitrary renaming is supported by adopting a combination of the manifest, which maps digests to files in the OCFL Object, and the state for each version, which maps these digests to the logical file paths comprising the complete logical state of the version. Closing.
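
A minimal Python sketch of why this combination handles renames: the manifest records where each digest's content is stored, each version's state records the logical paths for that digest, and a rename only changes the state. The inventory layout, the stored paths such as "v1/content/file1", and the `resolve` helper are illustrative assumptions, not the final specification wording; the digests are reused from the example earlier in this thread.

```python
from typing import Dict

# Hypothetical inventory: a manifest plus a complete state per version.
INVENTORY: Dict[str, dict] = {
    "manifest": {
        # digest -> files actually stored in the object (assumed paths)
        "a83e3633": ["v1/content/file1"],
        "bb123efc": ["v1/content/file2"],
    },
    "versions": {
        "v1": {"state": {"a83e3633": ["file1"], "bb123efc": ["file2"]}},
        # v2 renames file1 to file5 and copies file2 to file4;
        # no new content is stored, so the manifest is unchanged.
        "v2": {"state": {"a83e3633": ["file5"], "bb123efc": ["file2", "file4"]}},
    },
}


def resolve(inventory: Dict[str, dict], version: str, logical_path: str) -> str:
    """Map a logical path in a version to a stored file via its digest."""
    state = inventory["versions"][version]["state"]
    for digest, logical_paths in state.items():
        if logical_path in logical_paths:
            return inventory["manifest"][digest][0]
    raise FileNotFoundError(f"{logical_path} is not in {version}")
```

Here `resolve(INVENTORY, "v2", "file5")` and `resolve(INVENTORY, "v1", "file1")` both return the same stored file, which is the renaming behaviour the decision describes.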

@zimeon zimeon closed this as completed Sep 7, 2018
@zimeon zimeon added this to the Alpha milestone Oct 12, 2018

6 participants