Document the complete list of requirements for the VFS #28
Adding the list of scenarios to evaluate potential VFS implementations against, copied from #18, as requested here: #18 (reply in thread).
This list will give us a better idea of which modules can be bundled into an SPA as-is, and which use coding techniques that will need to be modified or special-cased to support the SPA runtime.
Static import
Globbing scenario
If code within the SPA does naive globbing -- using an off-the-shelf glob library -- then it might end up traversing the VFS. This depends on where the VFS is mounted and if
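To make the globbing concern concrete, here is a minimal sketch of a recursive directory walk that refuses to descend into a VFS mount. The `walk` helper and the `bundle` directory standing in for the mount point are assumptions made for illustration, not part of any proposal in this thread.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: recursively list files under `root`, but never descend
// into `vfsMount` (an assumed VFS mount point, not a real Node.js API).
function walk(root, vfsMount) {
  const results = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const full = path.join(root, entry.name);
    if (full === vfsMount) continue; // skip the mount instead of traversing it
    if (entry.isDirectory()) results.push(...walk(full, vfsMount));
    else results.push(full);
  }
  return results;
}

// Demo on a throwaway directory; `bundle` stands in for the VFS mount.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'vfs-glob-'));
fs.mkdirSync(path.join(tmp, 'src'));
fs.mkdirSync(path.join(tmp, 'bundle'));
fs.writeFileSync(path.join(tmp, 'src', 'a.js'), '');
fs.writeFileSync(path.join(tmp, 'bundle', 'b.js'), '');

const found = walk(tmp, path.join(tmp, 'bundle'));
console.log(found.map((p) => path.relative(tmp, p)));
```

An off-the-shelf glob library would not know to do this, which is why the mount location matters.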
There's just one section I feel it is missing... Something regarding ensuring that the VFS respects the paths defined within the application.
What I mean by this is that when a piece of code wants to access a file from the VFS (e.g., a simple JSON file), SEA needs to ensure that these paths are transformed correctly when it builds the binary.
Not sure if this is covered by "No interference with valid paths in the file system", but I want to ensure that in the scenario where a file exists both in the VFS and in the real file system for the same path (e.g. `./my-file.txt`, when the current working directory also has a `my-file.txt`), the VFS is smart enough to distinguish what should be accessed.
Another essential point that I feel is not covered here are FS permissions. Like, how we deal with scenarios where we don't have access to the actual file system or how we deal with insufficient permissions...
Finally, something mentioned here is how we deal with large VFSs as we don't want to deal with the application being slow on its bootstrap and execution because of random file accesses and the need to unarchive the whole VFS...
BTW, it may be worth adding a benchmarking sub-section to the "supported" area, as we should investigate the performance and reliability of the VFSs we want to support (either the ones we're creating or the ones we're thinking about supporting) under various circumstances, file systems and architectures.
I believe this is what we're thinking about in arcanis's explanation for how they mount zip files and avoid accidental deep directory traversal, and the thread about why mounting at
We could break "no interference" into multiple sub-requirements? Pick scenarios from the list here #28 (comment) appended with both "on the real filesystem" and "on the VFS". For example:
To me, this implies that all VFS paths are known at build-time and somehow transformed. So dynamic
In the spirit of #28 (comment): Should we also add a doc that explains how each VFS works internally? E.g. a primer explaining the characteristics of a Zip file: table of contents at the end for random access, compression is optional but supported, compression can be enabled per-file, can store symlinks. Then the same for the other VFS implementations being considered. Not sure if this already exists and I missed it, or if it should be a separate PR.
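As an illustration of the "table of contents at the end" point, the sketch below scans a buffer backwards for the Zip End of Central Directory (EOCD) record, which is what lets a reader jump to individual entries without unpacking the whole archive. The `findEocd` name is hypothetical, and the sketch ignores Zip64 and the possibility that the signature bytes appear inside the trailing archive comment.

```javascript
// Find the End of Central Directory (EOCD) record by scanning backwards.
// The record is at least 22 bytes long and starts with signature 0x06054b50.
function findEocd(buf) {
  const SIG = 0x06054b50;
  for (let i = buf.length - 22; i >= 0; i--) {
    if (buf.readUInt32LE(i) === SIG) {
      return {
        offset: i,
        totalEntries: buf.readUInt16LE(i + 10), // number of files in the archive
        centralDirSize: buf.readUInt32LE(i + 12),
        centralDirOffset: buf.readUInt32LE(i + 16), // where the table of contents starts
      };
    }
  }
  return null; // not a zip, or truncated
}

// An empty archive consists of the 22-byte EOCD record alone.
const emptyZip = Buffer.alloc(22);
emptyZip.writeUInt32LE(0x06054b50, 0);
const eocd = findEocd(emptyZip);
console.log(eocd);
```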
They don’t have to be known at build-time - the fs wrapper can provide privileged access inside the SEA to dynamic paths.
I don't think they need to be known at build-time, I was just thinking something like a Proxy or something that transparently wraps around
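One way to picture the wrapper idea is a `Proxy` around the `fs` module that answers reads for VFS paths itself and forwards everything else. This is only a sketch: `VFS_ROOT`, `vfsFiles` and `wrappedFs` are hypothetical names, the VFS is a plain in-memory `Map`, and only `readFileSync` with string paths is intercepted.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical in-memory VFS contents and mount root; both are assumptions.
const VFS_ROOT = path.resolve('/virtual-root');
const vfsFiles = new Map([
  [path.join(VFS_ROOT, 'config.json'), '{"name":"demo"}'],
]);

const wrappedFs = new Proxy(fs, {
  get(target, prop) {
    if (prop === 'readFileSync') {
      return (file, options) => {
        // Paths are resolved at call time, so they need not be known at build time.
        const resolved = path.resolve(String(file));
        if (vfsFiles.has(resolved)) {
          const data = Buffer.from(vfsFiles.get(resolved));
          const encoding = typeof options === 'string' ? options : options?.encoding;
          return encoding ? data.toString(encoding) : data;
        }
        return target.readFileSync(file, options); // fall through to the real fs
      };
    }
    return Reflect.get(target, prop); // every other API is untouched
  },
});

console.log(wrappedFs.readFileSync(path.join(VFS_ROOT, 'config.json'), 'utf8'));
```

A real implementation would have to cover the rest of the `fs` surface (async and promise variants, `stat`, `readdir`, and so on), which is where most of the difficulty lies.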
+1
Permissions are usually tied to the group/owner, etc distinction that is specific to one instance of the file system and won't translate to other file systems with a different user setup. That said, I think there is a subset of permissions that can be preserved. Execution is the main one that comes to mind. Tools like ASAR omit all permissions but the executable bit for this reason. The other aspect would be whether a file is readable or writable, but based on various threads so far, I think we agreed that the VFS is read-only, so we can probably ignore this characteristic.
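A small sketch of what "keep only the executable bit" could look like for a read-only VFS. The `normalizeMode` helper and the choice of `0o555`/`0o444` as the resulting modes are assumptions for illustration; this is not a description of how ASAR stores permissions.

```javascript
// Reduce a POSIX mode to what a read-only VFS could meaningfully preserve:
// readable by everyone, plus execute permission if any execute bit was set.
function normalizeMode(mode) {
  const isExecutable = (mode & 0o111) !== 0;
  return isExecutable ? 0o555 : 0o444;
}

console.log(normalizeMode(0o755).toString(8)); // 555
console.log(normalizeMode(0o640).toString(8)); // 444
```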
Yeah, this is exactly what I'm hoping for. I wanted to collect all the requirements we want first (without being biased too much by existing solutions), and then use that as the basis for checking which VFS out there, if any, checks the boxes or not.
This sounds very hard to do right, mainly with dynamic requires (which we should support). I didn't study this aspect much, but PKG seems to do it very well, so we should at least try to copy that approach cc @jesec
That would be amazing. Can you open a GitHub Issue for this?
Looks good. We can merge this and send further PRs to clarify later
Signed-off-by: Darshan Sen <raisinten@gmail.com>
I have added a section about this to the docs but I'm not sure what problems we might face while implementing that. Might be better if you could share that info.
Could you expand on that?
I believe that one is already covered in the "No interference with valid paths in the file system" section.
Added that in but would like to know if there are any gotchas.
Added a "Globbing" section.
Are you referring to the implications our VFS might have when nodejs/node#44004 lands?
Yea. 😅 (That PR is def a big breaking change).
Thanks, I meant the list as a checklist to evaluate potential VFS implementations. Some are not problems with the current VFS proposals but might be problems with future VFS proposals.
I suppose
If someone proposes that the VFS exist at a I can imagine someone proposing
We can have a couple examples of (contrived) corner-cases such as
Thanks, yeah looks like it's covered.
If the bundled files in the VFS correspond to certain paths that already exist
in the real file system, that will be very confusing, so it should use such
paths that cannot be used by existing files.
More than being confusing, this would outright break certain use-cases.
I think we all agree that we can officially drop that idea, as introducing a prefix would break a huge amount of things.
I think this is the best summary of the entire problem so far. We need a clear answer to this problem. The scenarios that we need to handle are:
All the cases except for the first one are trivial. For the first one, a potentially good-enough way of solving the dilemma is a precedence rule, i.e. we prefer the VFS location over the outer file system or vice-versa. However, which one we prefer is still a tricky problem, and I can see how it would give us pros and cons both ways. What do you think? Is there an approach other than precedence we would consider, that does not involve inventing an arbitrary prefix for the VFS?
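A precedence rule of this kind can be sketched as an ordered lookup. The two `Map`s below are hypothetical stand-ins for the VFS and the outer file system, and `resolveWithPrecedence` is an illustrative name; the point is only that the same colliding path resolves differently depending on which source is consulted first.

```javascript
// Hypothetical lookup tables standing in for the VFS and the real file system.
const vfs = new Map([['/app/my-file.txt', 'from VFS']]);
const realFs = new Map([
  ['/app/my-file.txt', 'from disk'],
  ['/app/only-on-disk.txt', 'from disk'],
]);

// Resolve `file` by consulting the sources in the given order of precedence.
function resolveWithPrecedence(file, order) {
  for (const source of order) {
    if (source.has(file)) return source.get(file);
  }
  return undefined; // the path exists in neither source
}

console.log(resolveWithPrecedence('/app/my-file.txt', [vfs, realFs]));
console.log(resolveWithPrecedence('/app/my-file.txt', [realFs, vfs]));
```

Either order silently shadows one of the two files, which is the trade-off being discussed.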
I think we should do what yarn does for zips: the VFS is available on sub-paths of the executable, as if the executable itself were a directory even though it is not. For example, if the executable is at
The executable's own path is the only location on the entire filesystem which is guaranteed not to contain any directories or files.
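The base-path idea can be sketched as a pure path check: anything strictly beneath the executable's own path belongs to the VFS, everything else to the real file system. The `toVfsPath` helper and the example paths are hypothetical; a real implementation would presumably use `process.execPath` and platform-specific path handling rather than `path.posix`.

```javascript
const path = require('node:path');

// Return the path relative to the VFS root if `file` lies beneath the
// executable's own path, or null if it belongs to the real file system.
function toVfsPath(file, execPath) {
  const rel = path.posix.relative(execPath, file);
  const escapes = rel === '..' || rel.startsWith('../');
  if (rel === '' || escapes) return null; // the executable itself, or outside it
  return rel;
}

// Hypothetical locations, chosen only for the demo.
const exe = '/usr/bin/app';
console.log(toVfsPath('/usr/bin/app/assets/data.json', exe)); // assets/data.json
console.log(toVfsPath('/usr/bin/other', exe)); // null
```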
Ah, that's brilliant. I love it. I can't think of any drawback to that. @RaisinTen shall we document that approach?
@jviotti does this look okay? single-executable/docs/virtual-file-system-requirements.md, lines 64 to 69 in b7f5638
@cspotcode I think I've addressed all your points too
If a single-executable formatter is run with an argument that is a path to a
file inside the VFS, it should be able to use the `fs` APIs to read, format and
print the formatted contents to `stdout`.
I think this section should be rewritten. The goal is not to allow this. The goal is to document this as a known gotcha, so authors of SEAs understand that this might happen unexpectedly.
I guess the decision we need to make here is: can files within the VFS be referenced from outside the VFS, or only internally?
If we have an approach that disambiguates files inside and outside of the VFS (like the base path approach), then I don't see why we should not allow this.
For those fundamental questions, let's try to document them in the README. I think it will be useful to have a concrete trail of fundamental questions & answers somewhere.
I'll start GitHub Discussions on a separate category to track these down, and start a PR documenting the results.
I guess most people in #18 were in favor of keeping the paths inside the VFS transparent? I have also changed the globbing part to better reflect that.
@RaisinTen Looks good. My only comment is that rather than saying "It might be better...", say "A possible solution...", etc.
@jviotti done, PTAL
@addaleax I have also moved the data compression part to the optional section since you mentioned that during the meeting, PTAL.
@RaisinTen Let's merge it. If there are further comments, we can send further PRs :)
This change attempts to exhaustively document all the requirements of
the VFS. If you disagree with a point or want to add more requirements
to this list, you are more than welcome!
Fixes: #21
Fixes: #18
Signed-off-by: Darshan Sen <raisinten@gmail.com>