WIP: Filesystem Interoperability Notes #227

ahankinson · 2018-10-12T11:57:51Z

A first pass at filesystem interoperability considerations.

Open for comment and review, not for merging (yet).

First pass Refs #212

Follows BagIt's model

ahankinson · 2018-10-12T19:22:17Z

Any thoughts?

awoods · 2018-10-12T20:39:13Z

draft/spec/index.html

+                    integrating files from filesystems that use other encodings.
+                </li>
+                <li>
+                    Some filesystems are not <strong>case sensitive</strong>, meaning two file names that differ only


Can we add a normative MUST, SHOULD, or MAY in this last item?


zimeon · 2018-10-13T00:28:58Z

draft/spec/index.html

+                    Access Control Lists or Hidden files.
+                </li>
+                <li>
+                    The <strong>character encoding</strong> of the filesystem and Inventory SHOULD be


I don't know what it means to talk about the character encoding of a filesystem. I think our expectation here is bytestream fidelity; without that, all is lost! I would thus rephrase this paragraph to talk only about inventory files.

ahankinson · 2018-11-21

Use 'Unicode-compatible' as the language.

ahankinson · 2018-11-21

If you have filenames that are encoded in a non-Unicode-compatible way, some transformations will be needed.

zimeon · 2018-10-13T00:35:40Z

draft/spec/index.html

+                </li>
+                <li>
+                    The <strong>character encoding</strong> of the filesystem and Inventory SHOULD be
+                    Unicode-compatible, either UTF-8, UTF-16, or UCS-2. Implementers may experience problems


JSON itself is defined over UTF-8, UTF-16 or UTF-32 (with either byte order for UTF-16 or UTF-32), see https://tools.ietf.org/html/rfc4627#section-3 . I do not know whether all JSON parsers have good support for all of these encodings. There is no note about UCS-2 encoding so I'm not sure how we imagine that to be handled? I feel a little out of my comfort zone with this question but I think we need to be more explicit and tie to the JSON spec. The current text seems to raise more questions than it answers in my mind.
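
To make the JSON encoding question above concrete, here is a minimal sketch, assuming Python 3.6 or later, where `json.loads` accepts bytes encoded as UTF-8, UTF-16 or UTF-32. The `inventory` dict and the `decodes_in` helper are hypothetical names used only for illustration. The sketch says nothing about how other JSON parsers behave, and it does not cover UCS-2.

```python
import json

# Hypothetical stand-in for an inventory; this is not the OCFL inventory format.
inventory = {"id": "example-object", "files": ["caf\u00e9.txt"]}


def decodes_in(doc, encoding):
    """Serialize doc as JSON in the given encoding and check it parses back unchanged."""
    raw = json.dumps(doc, ensure_ascii=False).encode(encoding)
    # json.loads detects UTF-8, UTF-16 and UTF-32 when it is given bytes.
    return json.loads(raw) == doc


results = {enc: decodes_in(inventory, enc) for enc in ("utf-8", "utf-16", "utf-32")}
```

A parser that only accepts UTF-8 would fail the second and third cases, which is one argument for tying the text to a single encoding.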

srerickson · 2018-11-08

Hi, I'm new here but wanted to add that it may be helpful to be more specific about Unicode Normalization, particularly given the difficulties encountered by folks working on BagIt. (See here).

rosy1280 · 2018-11-11

The example @srerickson raises seems to be one in which the user was running BagIt on a local system (or perhaps the now defunct Apple Server), which generated the issue. Since OCFL, at least in my mind, would be used by systems, not by people packaging things up, I can't imagine a situation where this would happen. Are there other cases or systems where this might happen? It's been a while since someone let me on a server, so I admit to being ignorant...

srerickson · 2018-11-12

@rosy1280 I think the edge case to consider is when a repository is rebuilt on a filesystem that handles filenames differently than the filesystem the OCFL Object was created on. In that situation (as in the example with BagIt) it might be possible for the filenames in the inventory to differ from the actual filenames (even though they both look the same, visually).
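
The edge case described above can be sketched as follows, assuming Python's standard `unicodedata` module. The paths and the `normalization_only_mismatches` helper are hypothetical; the point is only that two names can render identically while differing in their code points.

```python
import unicodedata

# The same visible name in two normalization forms (illustrative paths).
inventory_path = "v1/content/caf\u00e9.txt"   # NFC: precomposed "é"
disk_path = "v1/content/cafe\u0301.txt"       # NFD: "e" plus combining acute accent


def normalization_only_mismatches(inventory_paths, disk_paths, form="NFC"):
    """Return inventory paths that are absent on disk as written
    but present once both sides are normalized to the same form."""
    disk_exact = set(disk_paths)
    disk_normalized = {unicodedata.normalize(form, p) for p in disk_paths}
    return [
        p for p in inventory_paths
        if p not in disk_exact and unicodedata.normalize(form, p) in disk_normalized
    ]


mismatches = normalization_only_mismatches([inventory_path], [disk_path])
```

Here `mismatches` contains the inventory path, because the two strings are unequal as stored but equal after normalization. A check like this could flag objects that were moved through a filesystem that rewrites filenames.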

zimeon · 2018-10-13T00:40:33Z

draft/spec/index.html

+                    or "colon" (':') as a path delimiter.
+                </li>
+                <li>
+                    <strong>File permissions</strong> MAY be applied to files in an OCFL Object; however, implementers


File permissions are unavoidably applied to any file in all current filesystems I know about so the MAY here seems odd. I also don't think we should introduce fuzzy terms like ACLs and hidden files. I think we should make a simpler statement along the lines of:

File permissions are not portable across filesystems and are not expected to be preserved by OCFL clients.

rosy1280 · 2018-11-11

I believe that @neilsjefferies has language similar to this, and I remember a discussion in the September F2F meeting along these lines. So 👍 to @zimeon's suggestion.

zimeon · 2018-10-13T00:43:59Z

draft/spec/index.html

+                    integrating files from filesystems that use other encodings.
+                </li>
+                <li>
+                    Some filesystems are not <strong>case sensitive</strong>, meaning two file names that differ only


This seems like a significant question. We either allow a degree of incompatibility or we are rather restrictive. Neither is very appealing. BagIt takes the first approach but gives a description of the issue: https://tools.ietf.org/html/draft-kunze-bagit-17#section-6.1.1.1
so I think I would tend toward that approach. A link to BagIt might be nice here, but essentially this is the tack @ahankinson has taken in the PR.
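
As a rough illustration of the case-sensitivity concern, the sketch below groups paths that differ only by case. It assumes Python's `str.casefold` as an approximation; real filesystems apply their own folding rules, so this is not a faithful model of any particular one. The `case_collisions` helper and the sample paths are hypothetical.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that become identical after case folding.

    Returns a list of groups, each a sorted list of two or more paths
    that a case-insensitive filesystem might treat as the same file.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


collisions = case_collisions(["README.md", "readme.md", "data/a.txt"])
```

A client could run a check like this before writing an object to storage and warn, rather than fail, which is in line with the permissive approach discussed above.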

zimeon · 2018-10-13T00:45:01Z

I wonder whether some of these questions should be elevated to issues for discussion?

neilsjefferies · 2018-10-16T11:18:52Z

Windows is a real pain here. Under the hood, NTFS allows / and \ to be interchangeable directory separators, and it is also case sensitive and supports Unicode. However, many of its user-space tools differ, since they also support FAT variants with their variable handling of these aspects.
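
The separator and case differences can be seen without touching a real filesystem by using Python's `pathlib` pure path classes. This is a sketch of `pathlib`'s Windows-style and POSIX-style path semantics only, not of NTFS or FAT behaviour; the `split_as` helper and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def split_as(path_string):
    """Show how one string splits into components under each flavour."""
    return {
        "windows": PureWindowsPath(path_string).parts,
        "posix": PurePosixPath(path_string).parts,
    }


mixed = split_as("v1\\content/file.txt")

# Windows-flavoured paths compare case-insensitively; POSIX ones do not.
same_on_windows = PureWindowsPath("V1/File.txt") == PureWindowsPath("v1/file.txt")
same_on_posix = PurePosixPath("V1/File.txt") == PurePosixPath("v1/file.txt")
```

Under the Windows flavour the backslash acts as a separator, while under the POSIX flavour it is an ordinary character inside a single component, so the same string names different paths on the two systems.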

neilsjefferies · 2018-10-16T12:24:34Z

Actually, in the light of this (per folder case sensitivity) https://www.windowscentral.com/how-enable-ntfs-treat-folders-case-sensitive-windows-10

...Can we just tell NTFS to go forth and multiply....?

rosy1280 · 2018-11-11T16:28:32Z

I agree with @zimeon. If we didn't talk about this in the last Editors' Meeting, we should talk about it in the next one.

ahankinson · 2018-12-05T16:38:29Z

Editors call: 05/12/2018. After discussion, we'll close this and break specific discussions out to other tickets / PRs.

ahankinson added 3 commits October 12, 2018 12:32

b18f635 · Fixed: Note on filepath restrictions (First pass Refs #212)

6f2d973 · Fixed: Rename section to 'interoperability' (Follows BagIt's model)

747b3cc · Further clarification and expansion of filesystem features

ahankinson requested review from zimeon, rosy1280, awoods, julianmorley and neilsjefferies October 12, 2018 11:58

Reference Implementation Notes

b69478b

awoods requested changes Oct 12, 2018

zimeon requested changes Oct 13, 2018

ahankinson mentioned this pull request Oct 18, 2018

Added notes on Filesystem Features #243

Closed

awoods added the Needs Discussion label Nov 12, 2018

This was referenced Dec 5, 2018

UTF Encoding for OCFL #284

Closed

Case sensitivity and OCFL #285

Closed

ahankinson closed this Dec 5, 2018

awoods deleted the fixed-212-filesystem branch December 7, 2018 18:59
