Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the local source type #464

Closed
joelanford opened this issue Jul 19, 2022 · 17 comments · Fixed by #611
Closed

Improve the local source type #464

joelanford opened this issue Jul 19, 2022 · 17 comments · Fixed by #611
Assignees
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@joelanford
Copy link
Member

joelanford commented Jul 19, 2022

Getting a simple source implementation in place to support local bundle content is a major quality of life improvement from the current local source that uses a configmap to store bundle contents. However, this implementation has two major drawbacks:

  1. It limits the size of a bundle to the maximum etcd value size (minus overhead), which can be as little as 1MB
  2. It currently only supports plain manifest bundles because configmaps don't directly support directory hierarchies.

To improve the local source type, while still enabling the flexibility of storing bundle contents in configmaps, I would propose a few updates:

  • rename local to configmaps to make it clear that the backing storage is based on the configmap API
  • remove the configmap namespace from the API (i.e. require that bundle configmaps are sourced from the rukpak system namespace). This solves a scaling problem in that provisioners won't have to watch all configmaps in the cluster.
  • change the API to accept multiple configmaps, where each configmap represents a directory within the bundle. Each referenced configmap would include the directory where the configmap should be "mounted" and the name of the configmap
  • Improve the immutability story of this source (e.g. require that referenced configmaps have immutable: true, don't allow configmaps to be deleted if they are referenced by a bundle).

This solution improves on item 1 above in that it effectively changes the bundle size maximum from "1MB for the entire bundle" to "1MB for each file in the bundle". It's still not quite as good as the solution proposed in #472, but it will be a vast improvement over the current local limitation.

This solution completely solves 2 because it enables arbitrary directory hierarchies to be described in the API.

@exdx
Copy link
Member

exdx commented Jul 19, 2022

Added to current milestone

@timflannagan
Copy link
Contributor

Are we putting a line in the sand that we don't want to support local (e.g. ConfigMap) bundle sources, or is this something we can re-evaluate over time? Do we have an idea of any explicit use cases for why users want/need to be able to use ConfigMaps to house bundle contents?

@timflannagan timflannagan added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jul 20, 2022
@exdx
Copy link
Member

exdx commented Jul 20, 2022

I imagine both can be useful -- some users will be more comfortable with configmaps then tar.gz files as a source type. I don't see why adding binary necessarily involves ripping out local.

Furthermore, from a planning perspective, local bundle support was on the radar for a while, triaged, groomed, added to a milestone, and delivered. The binary feature sort of came from nowhere and is removing that local feature in the very next release. I'm not convinced this is optimal.

@timflannagan
Copy link
Contributor

I imagine both can be useful -- some users will be more comfortable with configmaps then tar.gz files as a source type

Yeah, that's a good point. It's possible I'm missing some context on why we're pursuing this implementation though during my brief hiatus from rukpak. Maybe this is something that SDK can immediately hook into as well, which implies that we'd want to pursue a more robust local bundle content implementation?

@exdx
Copy link
Member

exdx commented Jul 20, 2022

That's true, SDK could wrap the binary type and produce a Bundle without any concerns in regards to size.

From a GitOps perspective though, checking in a configmap and a bundle that references it is more transparent than a tar.gz

@joelanford
Copy link
Member Author

joelanford commented Jul 21, 2022

some users will be more comfortable with configmaps then tar.gz files as a source type

Even if we kept the configmap-based source in, I think we'd want to solve problem 2 that I mention in the issue description (configmaps don't natively support directory hierarchy). That's very limiting since most bundle formats will very likely end up evolving to use a directory somewhere.

My perspective is that we're currently trying to solve the problem of "how does a developer work with an arbitrary local bundle?" and I feel like the binary source is an improvement on the local source that solves the same problem.

From a GitOps perspective though

This seems like a different use case, and one that I'm definitely interested in discussing more. I think rukpakctl run is GitOps friendly in that you can just keep running it over and over, but I admittedly was really only focusing on the "how do I get arbitrarily sized bundles into the cluster?" problem.

Perhaps we should step back and enumerate the problems we're trying to solve before we proceed?

@timflannagan
Copy link
Contributor

Another point that we discussed earlier today was the limitation in ConfigMaps where you can't easily mock a directory structure, which may be prohibitive when expanded to other bundle formats.

Should we re-title this issue to be centered around "Introducing the binary source type", so there's some signal in the release notes for what's being added here? That issue title may need some work as there's potential for removal of the source type entirely.

@joelanford joelanford changed the title The local source implementation has a few drawbacks Introduce the binary source type to supersede local source type Jul 21, 2022
@exdx
Copy link
Member

exdx commented Jul 21, 2022

I'm also not a huge fan of the name binary source type-- if we are removing the existing local implementation for this one, does it make sense to title this source type local instead?

@exdx
Copy link
Member

exdx commented Jul 21, 2022

Regarding GitOps, I was referring to the general scenario where you check in some k8s objects into a repository, that then get synced down into the cluster, and the status of those objects is then updated as they get reconciled. rukpakctl run is imperative and more opaque, so it's not exactly the workflow I had in mind.

@joelanford
Copy link
Member Author

joelanford commented Jul 22, 2022

I'm also not a huge fan of the name binary source type

I'm actually not either. I think I picked that up from the OpenShift Build API.

My thought was upload since local perhaps overly implies that the bundle has to come from a local machine, when in fact another cluster component could easily perform the same steps that rukpakctl run does, but from within the cluster.

then get synced down into the cluster

So you're thinking fully kubectl apply-able? In that case, there's really no way around the etcd size limitations unless the Bundle you're applying specifies that the cluster should pull the bundle contents from some external place (e.g. git, image registry, https)

I did notice that the OpenShift Build API does have a configmaps source. And it gets around the directory hierarchy problem like this:

source:
  configmaps:
  - destinationPath: "/"
    configMapName: my-bundle-root
  - destinationPath: "/manifests"
    configMapName: my-bundle-root-manifests
  - destinationPath: "/metadata"
    configMapName: my-bundle-root-metadata

What would folks think about the following?

  1. Rename local to configmaps
  2. Use the above schema, where each referenced configmap can be "mounted" at a directory path from the perspective of the bundle provisioner
  3. Require that all referenced configmaps have immutable: true
  4. Forbid deletion of configmaps referenced in a bundle source. (this prevents the configmap from being deleted and recreated with the same name out from under the bundle)
  5. Perhaps allow multiple configmaps to have the same destinationPath and resolve filename conflicts by always overwriting as the list of configmaps is iterated.

I think this would completely solve the issue of the directory hierarchy AND with item 5, we could essentially end up supporting arbitrary size bundles and arbitrary size bundle directories with the one limitation that no single file can be greater than the etcd value size limit.

@exdx
Copy link
Member

exdx commented Jul 22, 2022

I think the proposed API is reasonable -- slightly more complex, but if we document it it should be fine. If a user just wants to use a plain configmap bundle, they can just use one /manifests destination path.

For item 3 -- this would be done via an update to the core webhook, right? It may be expensive to query for all configmaps, the webhook should probably have to cache configmaps, at least those with a specified label. We can require that label to be present on the configmap on creation.

For item 5 -- this could also potentially just return an error. It's hard to guess what the user's intention is when they have conflicting destinationPaths

@joelanford
Copy link
Member Author

For 5, you're saying allow multiple configmaps to refer to the same directory, but forbid configmaps mounted in the same location from sharing the same key?

@joelanford
Copy link
Member Author

I think 3 would happen in the unpack stage. We'd just have a bundle unpack failure if the underlying configmaps didn't meet the requirements.

I think 4 would require a validating webhook on configmap deletion. I think we should limit bundle configmaps to only be allowed in the rukpak-system namespace, so that would mean the validating webhook could at least be scoped there. If we wanted to further scope to require immutable: true AND that it has a certain label, then we could scope the webhook even further.

@timflannagan
Copy link
Contributor

timflannagan commented Jul 23, 2022

Should we check with the build API team on whether they've run into any issues with this API design? What Joe outlined seems like a reasonable approach to me, although I worry about the additional complexity we're introducing here. Maybe we have some rope though with the API design given the v1alpha1 versioning.

I think I'm favor of this new approach though, and have the added benefit of an easy way to expose resolveset bundle contents through ConfigMap resources (and therefore we also have a record of resolution outcomes?) while providing an escape hatch for bundles that have larger size needs.

And an attempt at putting it all together:

  • I think we'd need to use this "upload" bundle source type for OLM's existing registry+v1 bundle format for any higher level component that generates BundleDeployment resources
  • We could still mock registry+v1 directory structure by specifying the destinationPath field for the manifests and metadata directories if we needed to support this. It's unclear to me what immediate use cases would require this functionality.
  • We can use ConfigMaps for things like resolveset bundles so we have a record of resolution decisions, and pivot to something more solidified once deppy comes into play.

Thoughts? I'm still trying to get up-to-speed on the binary/upload manager aspect, so it's possible I'm missing some context over these past couple of weeks.

@joelanford joelanford changed the title Introduce the binary source type to supersede local source type Improve the local source type Jul 26, 2022
@joelanford
Copy link
Member Author

I repurposed the description of this issue to focus primarily on the improvements we could make to the local (configmap-based) API.

Should we bump this out of the 0.8 milestone?

@timflannagan
Copy link
Contributor

Should we bump this out of the 0.8 milestone?

Works for me 👍

@joelanford joelanford modified the milestones: v0.8.0, v0.9.0 Jul 26, 2022
@timflannagan timflannagan modified the milestones: v0.9.0, v0.10.0 Aug 20, 2022
@timflannagan timflannagan modified the milestones: v0.10.0, v0.11.0 Sep 2, 2022
@tylerslaton tylerslaton removed this from the v0.11.0 milestone Sep 29, 2022
@tylerslaton tylerslaton added this to the backlog milestone Sep 29, 2022
@github-actions
Copy link

This issue has become stale because it has been open 60 days with no activity. The maintainers of this repo will remove this label during issue triage or it will be removed automatically after an update. Adding the lifecycle/frozen label will cause this issue to ignore lifecycle events.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 29, 2022
@joelanford joelanford added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants