-
Notifications
You must be signed in to change notification settings - Fork 302
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cue/load: support loading from io/fs.FS #607
Comments
Original reply by @mpvl in cuelang/cue#607 (comment) I'm not entirely sure I understand what you're asking. But It would be useful to use embedding and fs. But we aim to support at least 2 versions back, so that will take a while. Either way, these are implementation details, and overlays should work for all files. It could be that it requires a directory that is overlayed to exist, but also that could be tweaked. I'm not sure what you desired or expected here, so I can't tell what is wrong easily. |
Original reply by @verdverm in cuelang/cue#607 (comment) I'm seeing issues with the overlay, I think because of the I've added some printing, working on a test file and setup, will push something to gerrit later
I've tried various settings for |
Original reply by @verdverm in cuelang/cue#607 (comment) I think I found the issue:
Now, since almost all ...unless the users should be placing
But, given the comment here: https://github.com/cuelang/cue/blob/master/cue/load/config.go#L263 config settings
|
Original reply by @tonyhb in cuelang/cue#607 (comment) Using Overlay with 1.16's FS to embed my packages in my binary and ran into this. Also notice that there are a few filesystem requests when loading packages even if they're in
This breaks a very specific flow: I'm compiling a cue validation binary to WASM, with all cue files in memory via Overaly. However, because import.go always accesses the disk modules break and cannot be used in a webassembly build. |
Original reply by @verdverm in cuelang/cue#607 (comment) There are other places where Cue uses the local FS. In particular, this was a recent bugfix: cuelang/cue@a616925 Cue has a WASM playground, maybe that could help as a reference. The repo for that part is here: https://github.com/cue-sh/playground |
Original reply by @tonyhb in cuelang/cue#607 (comment) Yes, but in particular using overlay in a wasm environment causes an an error here: https://github.com/cuelang/cue/pull/786/files#diff-67213f39b84a013e7db04664cd3ad943b7a237d97b9c6e862d10806ac27def28R418-R420 :) There are several places where a filesystem is still expected even after that bugfix. The PR above fixes the ones that I've ran into in the latest beta. It (should) also enable module imports from overlays by fixing areas of the filesystem code that didn't inspect overlays as it probably should. |
Original reply by @myitcv in cuelang/cue#607 (comment) Building on my comment in cuelang/cue#786 (comment), like @mpvl I'm neither entirely clear what's being asked for, nor do I understand what bug is being reported in this issue. That said, thanks @verdverm and @tonyhb for looking into this, because I think there is a gap here that is nicely filled by The docs for
(I've just mailed https://cue-review.googlesource.com/c/cue/+/9021 to remove that last out-of-date paragraph) Put another way, an overlay, in its current form, exists in order to complement what is on disk, rather than wholesale replace its contents. Therefore I'm unclear that the following expectation is correct:
Indeed it's not possible AFAIU to have an But as I mentioned I think such a "switch" is unnecessary - we can instead add an We should as, part of this change, decide what to do with We could safely make this change today, guarded by build tags - Thoughts on that as a way forward? If so, we would like help implementing this. The only gentle request being that we make this change via Gerrit (because discussion about code, review etc is much easier and more productive). |
I was unable to find any docs in this repo for official go version support, but if the CI/CD testing matrix reflects reality it looks like 1.16.x versions are now the earliest supported, which would allow this change to be made without the added complexity of supporting earlier filesystem APIs cue/.github/workflows/test.yml Lines 35 to 37 in 28f8d47
I am currently in the initial stages of building a cue based utility that needs to embed the go library, which required me to delve a little to deep into the cue/load package and was forced to look for workarounds even for the most initial proof of concept. When looking for examples in production, I stumbled upon the following in the grafana codebase where some serious concerns were expressed about the current loading behavior, which would likely be fixed by the suggestions above. I’ve submitted a potential update via Gerrit: https://review.gerrithub.io/c/cue-lang/cue/+/533307. |
@myitcv, would the pull request above represents a reasonable first step per your comment? Use of the Overlay is so deeply embedded in tests throughout the project, so removing it didn't seem advisable for a first effort (if only to preserve the assurances of the original tests). It does allow use of embed.FS which is a requirement of my project and would allow the use of an in memory filesystem. OverlayFS, which was my refactoring of much of the existing code, could be trivially repurposed for this use; however there is likely a simpler implementation if backward compatibility does not need to be preserved. |
Does this work for output? Such that an import or export would write to the |
@verdverm The io/fs.FS interface is read only which prevents using it for export. However from what I can tell (though I'm relatively new to the codebase) virtualizing the writes is much less critical. Most cue library functions output go values/pointers (rather than the names of created files) which means that virtualized file output isn't core to the internal workings of cue / the ability to embed it as a library. File output generally occurs closer to the edge of the cue libraries. In general, reading is where a lot of cue's complexity (and potential security holes) are because of import resolution. As part of loading a cue module, your entire filesystem hierarchy can be traversed. This isn't really a problem when using it from the cli, where the executing users permissions fully describe the security context you are operating in. For tools where you want to load tenant specific cue files from a single service, you want to be able to control what files get read in, regardless of any other factors. Prior to this patch, os file reads occur in internal functions, often as a fallback for when appropriate arguments are not provided. This can lead to unpredictable / unsecure behavior. From a usability standpoint, allowing other filesystems gives the library user the freedom to use the full cue module loading rules (rather than needing to re-implement it manually) while giving significantly more flexibility as to how those files are provided. |
Does this break current usage?
Module support will likely mean packages from a User's homedir will need to be accessible. This would be similar to how Go has a shared module cache typically outside of the current directory. Would it be better to only use |
@verdverm So for this to be a good fix (and not require two largely parallel implementations), the fs.FS interface needs to be used everywhere since a lot of the reads are in internal functions and not just at the edge. However it is certainly possible to prevent regressions in these default file systems, as any function in the interface can be overridden. Both OverlayFS and OSFS are meant to preserve existing behavior, however it is possible there are tests only in unity that would expose incompatibilities. Cue loader's current support for symlinks is rather unclear, they are mostly mentioned in dead code. The implementation of directory walking in io/fs claims to only support symlinks at the root of the walk. However, the only difference I can find currently is that the current code uses ioutil.ReadDir instead of the default io/fs ReadDir. Regardless we should probably determine a test case or two to prevent regression if symlink traversal is important. If we can agree what constitutes a regression, happy to ensure we don't have one! As for breaking API changes, I didn't remove any existing tests or make any real changes to them (other than having them use safer (non-depreciated) methods when appropriate. So while I'm confident I didn't remove any tested public APIs, there may have been an untested / internally unused one I accidentally removed. Could you let me know which function you saw removed? I think I was unclear about module support, I mean simply the ability to resolve imports relative to a cue.mod file, such that a library developer can write an independently testable cue program and runnable cue program, and load it from a go program from a constrained directory. |
I think I was confused about the breaking API change (reading on mobile) For module support, I'm referring to the future plans and how we should think about this might impact that, as part of any changes now. Here's a reproducer for the symlink issue
|
Thanks so much! I'll fix that up. Really all the changes here are just what is necessary to handle all import / decoder reads from a virtual file system. I don't think it will make much of a difference in how loading of user installed modules would be handled. Likely you would want to give the library user the option (but not the requirement) to specify different filesystems for the user modules and the code they are trying to execute; even though OSFS could handle both for the command line. |
This updates all loading and decoding libraries that require filesystem io to use the io/fs interface. This allows tooling built using the cue libraries to have significantly more control as to what files are accessed during loading. Backward compatability was maintained by refactoring filesystem code into overlayFS: an implementation that preserved overlay behavior while also implementing the io/fs interface. Tests were added to test loading from an embedded filesystem. File parsing and source loading functions that fell back to filesystem reads were deprecated due to unpredictable and potentially insecure io behavior. They were replaced by functions that required the src parameter to be set. Internal consumers of these functions in cue libraries (but not all tests) were migrated to the new functions. Significant numbers of unused internal functions were removed from the load package. To conform to io/fs standard, all paths are used with the standard '/' separatorexcept in the filesystem implementations. Fixes #607 Signed-off-by: Elliot Swart <elliot@swart.org> Change-Id: Id3faeafbbbe211a679e4b75ac2d8635a3c33eb6c Signed-off-by: Elliot Swart <elliot@swart.org> Change-Id: I104f40d97bd295732a20d8254aaf241c88a31cac Signed-off-by: Elliot Swart <elliot@swart.org>
@ElliotSwart another question which came up but I haven't given much thought to...
It looks like a number of functions were introduced / deprecated only in name to support alternative behavior despite the API not changing |
@ElliotSwart i was hoping for someone to take this colossal work :-) Thank you so much for this ! How do you want to cooperate on this ? I can try to test if this will work for my particular use case and provide feedback. How do you prefer to receive feedback in gerrit or in here ? My personal preference is github :-) @myitcv can this PR (assuming it checks all the requirements ) potentially merge after 1.18 is released ? My assumption is that it cannot be before that ? |
@yassinm Currently there are comments being added to the gerrit link above, with lots of housekeeping stuff since I'm new to the codebase, gerrit, etc. Happy to let you know once we get to the later stages. I'm not sure the standard procedure for how people contribute tests, but I love tests so certainly wouldn't want to discourage you! |
@verdverm @yassinm just uploaded the next patch set. @verdverm, your test script now works (at least on my machine). |
@ElliotSwart great work! |
Originally opened by @verdverm in cuelang/cue#607
Is your feature request related to a problem? Please describe.
I would like to embed a module into the overlay. This would be useful for shipping binaries with the schema embedded, especially with go 1.16 supporting file embeds natively.
This currently does not work due to how the overlay and
cue.mod/module.cue
is handled incue/load/...
Another use case would be loading all inputs from the Overlay. This is useful in a server context where one wants to avoid disk usage. This seems to not work because of how the args to
load.Instances
works. It seems one cannot specify files from the Overlay there.Am I missing a configuration setting?
Describe the solution you'd like
The ability to do the above. Guidance and input on implementation.
Additional context
I've traced the first use case to some TODOs in the code. The first is here: https://github.com/cuelang/cue/blob/master/cue/load/config.go#L498
The processing in this area generates absolute paths. The
fileSystem
infs.go
https://github.com/cuelang/cue/blob/master/cue/load/fs.go#L206With Go 1.16, there is a new FS interface, maybe that should be used? If so, maybe that means this feature is delayed until 1.16 is more widely adopted?
Example that is not working as desired or expected:
The text was updated successfully, but these errors were encountered: