Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only include yq in the resulting image when modules require it at runtime #261

Closed
RoyalOughtness opened this issue Nov 11, 2024 · 31 comments
Assignees

Comments

@RoyalOughtness
Copy link
Contributor

RoyalOughtness commented Nov 11, 2024

Currently, all images built with bluebuild will include yq, regardless of whether they need it.

$ rpm -q --whatprovides /usr/bin/yq
file /usr/bin/yq is not owned by any package

yq consistently lags behind on golang versions, meaning it triggers a number of trivy findings if included. So ideally, it would only be included in images that require it at runtime.

@xynydev
Copy link
Member

xynydev commented Nov 12, 2024

I would rather just #216

default-flatpaks I think is currently the only module that requires yq at runtime (@fiftydinar can fact-check me on that)

Refactoring modules to not need yq, removing it completely, and adding Nushell (at least for some of the more complex modules) is part of the plan that I started the discussion for here: #257

Depending on which approach is chosen for including Nushell, conditionally including yq could be implemented the same way.

yq is currently part of stage-bins, so making that conditional could require some static analysis that might not be required with a more elegant refactoring

COPY --from=docker.io/mikefarah/yq /usr/bin/yq /bins/yq

@fiftydinar
Copy link
Contributor

I would rather just #216

I agree.

However, @RoyalOughtness might consider this move as adding additional attack surface, since bash already exists. I would like to know his opinion on this regarding security.

We'll likely only install nushell when necessary, like you mentioned.

default-flatpaks I think is currently the only module that requires yq at runtime (@fiftydinar can fact-check me on that)

Yes, that's correct.

@xynydev
Copy link
Member

xynydev commented Nov 12, 2024

adding additional attack surface, since bash already exists

True. Nushell is probably alright regarding security (based on vibes alone) and I wouldn't imagine it having a similar issue as yq with Golang versions (since it's in Rust lol, but also Rust has better backwards compatibility I think).

@gmpinder
Copy link
Member

Do we store files as YAML at all, or are they stored as JSON? Because if they're stored as JSON, we can probably get away with just installing JQ instead.

@fiftydinar
Copy link
Contributor

Do we store files as YAML at all, or are they stored as JSON? Because if they're stored as JSON, we can probably get away with just installing JQ instead.

According to #216, they are provided as JSON.

I can switch default-flatpaks to use JSON for repo config & jq for parsing.

And with that, I think that we can just install jq if not installed, but it's already installed in Fedora by default.

@gmpinder
Copy link
Member

Perfect! Let's go with this path then.

@gmpinder gmpinder self-assigned this Nov 23, 2024
@gmpinder gmpinder linked a pull request Nov 23, 2024 that will close this issue
@gmpinder
Copy link
Member

Ok there appears to be more than just default-flatpaks module that uses yq. I think how we should handle this is a 3-step process.

  1. Have the CLI switch to using jq wherever possible and make sure it's installed. Make a patch release with this change.
  2. Update all of our modules that use yq to use jq instead.
  3. Update the CLI to remove yq installation and then do another release

How does that sound?

@fiftydinar
Copy link
Contributor

Ok there appears to be more than just default-flatpaks module that uses yq.

What I meant by only default-flatpaks is that it's the only module which uses yq in run-time.
Of course, all modules use yq in build-time at the moment to grab recipe config.

I think how we should handle this is a 3-step process.

  1. Have the CLI switch to using jq wherever possible and make sure it's installed. Make a patch release with this change.
  2. Update all of our modules that use yq to use jq instead.
  3. Update the CLI to remove yq installation and then do another release

How does that sound?

  1. If it's supposed to be the default feature, is new flag even needed? Can it detect dynamically on JSON vs YML, so it uses whatever is needed (which can be removed later, as jq would be used from now on?)
  2. Sounds good.
  3. Sounds good.

@gmpinder
Copy link
Member

If it's supposed to be the default feature, is new flag even needed? Can it detect dynamically on JSON vs YML, so it uses whatever is needed (which can be removed later, as jq would be used from now on?)

I'm not sure what you mean. This is all inside the build, not in the CLI itself. The CLI can handle reading YAML and JSON without the need for yq or jq.

@gmpinder
Copy link
Member

I think Step 3 should be done for the v0.9.0 release since this could be a breaking change for users who have scripts that rely on yq. Moving the modules to jq shouldn't be too disruptive.

@gmpinder
Copy link
Member

Ok, we've got a patch out that ensures that jq is installed. We should be good to start moving to using it in the modules

@fiftydinar
Copy link
Contributor

@gmpinder Should we make a successor function for get_json_array compared to get_yaml_array?

@gmpinder
Copy link
Member

@gmpinder Should we make a successor function for get_json_array compared to get_yaml_array?

Hmm, yeah, that would probably be better for consistency's sake.

@fiftydinar
Copy link
Contributor

@gmpinder Should we make a successor function for get_json_array compared to get_yaml_array?

Hmm, yeah, that would probably be better for consistency's sake.

Added a comment about using jq -r instead of jq, to not include JSON quotes.

New GitHub UI might make my comment harder to see on commit, so typing here.

@gmpinder
Copy link
Member

@gmpinder Should we make a successor function for get_json_array compared to get_yaml_array?

Hmm, yeah, that would probably be better for consistency's sake.

Added a comment about using jq -r instead of jq, to not include JSON quotes.

New GitHub UI might make my comment harder to see on commit, so typing here.

Good point. I will go in and change that.

@gmpinder
Copy link
Member

It's changed and I made a release, however I've come across some bugs that I hadn't noticed before so I'm working on getting those taken care of before I do another patch release. @xynydev @fiftydinar if you can hold off on merging any yq to jq module changes until I get this sorted

@xynydev
Copy link
Member

xynydev commented Nov 26, 2024

One concern; doesn't removing yq and get_yaml_array possibly break custom modules made by users? We'd have to announce this as a breaking change and IMO allowing yq & co to be included somehow for the sake of compatibility (at least in the first releases with this change) would be great.

Edit: apparently both versions are kept now, and will be removed in v0.9.0. This seems sensible.

@gmpinder
Copy link
Member

Ok, I've got the new get_json_array function exported properly and the arm images building AND fixed docker inspection of multi-arch images. We should be good to go for this change in modules.

@gmpinder
Copy link
Member

This prepare for v0.9.0 pr (#275 ) will have the change for removing yq and once the module changes are merged in, the CI runs should start running green.

@xynydev
Copy link
Member

xynydev commented Dec 2, 2024

I see that currently jq is installed with rpm-ostree. Wouldn't it be more compatible to COPY it from the image, like we did with yq? The program has no external dependencies.

@fiftydinar
Copy link
Contributor

fiftydinar commented Dec 2, 2024

I see that currently jq is installed with rpm-ostree. Wouldn't it be more compatible to COPY it from the image, like we did with yq? The program has no external dependencies.

The advantage of this approach is that we'll likely get newer versions of jq faster than Fedora, while with rpm-ostree approach advantage is that it shows in installed packages when running rpm -q jq or rpm -qa, so it's a bit cleaner.

Don't know how COPY in container behaves when file already exists though, like in current situation. It probably overwrites it?

@xynydev
Copy link
Member

xynydev commented Dec 2, 2024

I think it overwrites it, yeah.

@fiftydinar
Copy link
Contributor

jq v1.7.1 is released on 13th December 2023. Official container build exists since then.

1st v1.7.1 version that appeared on Fedora is on 5th January 2024.

so there's a little delay on Fedora's side.

@fiftydinar
Copy link
Contributor

fiftydinar commented Dec 2, 2024

Those are the files that Fedora's spec file copies (ommited documentation & copyright stuff):

%{_bindir}/%{name}
%{_libdir}/libjq.so.*

This lib should be removed to not conflict with container jq

Or we can be safe to not install jq manually with container until it's needed imo.

@xynydev
Copy link
Member

xynydev commented Dec 2, 2024

The lib might be used by something. I wouldn't imagine there's a conflict. We could check if jq is installed and install it with the COPY method only if it's not.

@gmpinder
Copy link
Member

gmpinder commented Dec 2, 2024

The lib might be used by something. I wouldn't imagine there's a conflict. We could check if jq is installed and install it with the COPY method only if it's not.

Too complicated to do since that would require running the image as a container before generating the Containerfile.

@xynydev
Copy link
Member

xynydev commented Dec 2, 2024

Ah, good point. Maybe we can just keep it as-is, which reduces support for non-Fedora distros, but oh well. Or see if the container image includes libjq and install both jq and libjq from there.

@gmpinder
Copy link
Member

gmpinder commented Dec 2, 2024

Those are the files that Fedora's spec file copies (ommited documentation & copyright stuff):

%{_bindir}/%{name}
%{_libdir}/libjq.so.*

This lib should be removed to not conflict with container jq

Or we can be safe to not install jq manually with container until it's needed imo.

If this is a library that is installed from Fedora packaging, then we could inadvertently break someone's build by replacing the jq binary with a newer version. I'm voting on keeping it installed with rpm-ostree for now

@gmpinder
Copy link
Member

gmpinder commented Dec 2, 2024

Ah, good point. Maybe we can just keep it as-is, which reduces support for non-Fedora distros, but oh well. Or see if the container image includes libjq and install both jq and libjq from there.

Currently only Fedora is the supported Distro here, so I honestly don't care. There's been no movement to anything else in months so I'm not going to waste time optimizing for a theoretical

@xynydev
Copy link
Member

xynydev commented Dec 2, 2024

That's a self-fulfilling prophecy, but sure. The reasoning in this case seems solid.

@fiftydinar
Copy link
Contributor

@RoyalOughtness yq is not included anymore, so this issue is fixed & can be closed.

@xynydev xynydev closed this as completed Dec 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants