This repository has been archived by the owner on Aug 12, 2024. It is now read-only.

Introduce upload source and rukpakctl run #417

Merged

joelanford merged 7 commits into operator-framework:main from binary-source on Aug 3, 2022

Conversation

@joelanford (Member) commented Jun 3, 2022

This PR introduces an upload source and a rukpakctl run command that makes use of the new upload source.

Closes #472

The local source is a great way to get up and running quickly with a mechanism that enables direct use of bundle contents on a Kubernetes cluster without requiring those bundles to be pushed to a git remote or an image registry. However, the local source's use of ConfigMaps could only get us so far. ConfigMap size is restricted by etcd configuration (typically 2MB), so the local source could only support bundles up to 2MB in size. Some improvements for the local source are discussed in #464.

The upload source does not place this restriction on bundle size because it does not rely on etcd for storage of local bundle content.

An upload source enables bundle developers or other clients to upload bundle tar.gz files directly in order to inject their contents into bundles.

This PR introduces an upload manager service that:

  • accepts uploads for bundles with the upload source type
  • serves those bundles to provisioners that request them
  • garbage collects its local storage when bundle uploads are no longer needed

This also introduces client tooling necessary to actually exercise this functionality in a user-friendly way. It:

  • Adds rukpakctl run to create a BD with an upload source bundle, get the name of the generated bundle, and then upload the bundle contents to the upload manager, which makes that content available to the bundle provisioners.
  • Adds a simple library for running a port-forward to a specific service in the cluster (in this case, we use it to connect to the upload manager from outside the cluster).
  • Adds a BundleUploader type that is essentially a small API for actually doing the upload; it is used in rukpakctl run and also in the e2e tests. (A rough sketch of such an upload follows this list.)
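For a concrete picture of what that upload can look like over the wire, here is a minimal client-side sketch. The endpoint path (`/uploads/<bundle>.tgz`), the function name, and the error handling are illustrative assumptions rather than the exact API this PR implements; only the bearer-token authentication is taken from the PR's upload source fields.

```go
package rukpakctlsketch

import (
	"context"
	"fmt"
	"net/http"
	"os"
)

// uploadBundle streams a local bundle .tgz to an upload manager endpoint.
// The URL layout and status handling here are illustrative assumptions.
func uploadBundle(ctx context.Context, httpClient *http.Client, baseURL, bundleName, tgzPath, bearerToken string) error {
	f, err := os.Open(tgzPath)
	if err != nil {
		return fmt.Errorf("open bundle archive: %v", err)
	}
	defer f.Close()

	req, err := http.NewRequestWithContext(ctx, http.MethodPut,
		fmt.Sprintf("%s/uploads/%s.tgz", baseURL, bundleName), f)
	if err != nil {
		return fmt.Errorf("build upload request: %v", err)
	}
	req.Header.Set("Authorization", "Bearer "+bearerToken)
	req.Header.Set("Content-Type", "application/gzip")

	resp, err := httpClient.Do(req)
	if err != nil {
		return fmt.Errorf("do upload request: %v", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("unexpected response: %s", resp.Status)
	}
	return nil
}
```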

@joelanford joelanford requested a review from a team as a code owner June 3, 2022 14:57
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 3, 2022
@joelanford (Member, Author) commented Jun 3, 2022

Hmm.... I wonder if the inclusion of the binary: {} struct in the Bundle API is causing go-spew (and thus the hashing for name generation) to change the output. I ran the e2e twice and I'm pretty sure this PR's code produced the same (incorrect, according to the tests) hash suffix for the generated bundle.

EDIT: Yeah, it sounds like upstream has a similar problem: https://github.com/kubernetes/kubernetes/blob/b7337cb171ab126cb892aacdab2816017a290841/pkg/controller/deployment/util/deployment_util.go#L598-L599
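For context, the upstream pattern linked above hashes a go-spew dump of the object, so adding any field to the hashed struct (even an empty binary: {} member) changes the resulting name suffix. A minimal sketch of that approach, assuming a go-spew configuration like upstream's:

```go
package hashsketch

import (
	"fmt"
	"hash/fnv"

	"github.com/davecgh/go-spew/spew"
)

// deepHash mirrors the upstream DeepHashObject idea: dump the object with
// go-spew (methods disabled, keys sorted) into an FNV hash. Because the dump
// includes every struct field, introducing a new field such as `binary: {}`
// alters the resulting hash suffix.
func deepHash(obj interface{}) string {
	hasher := fnv.New32a()
	printer := spew.ConfigState{
		Indent:         " ",
		SortKeys:       true,
		DisableMethods: true,
		SpewKeys:       true,
	}
	printer.Fprintf(hasher, "%#v", obj)
	return fmt.Sprintf("%x", hasher.Sum32())
}
```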

@joelanford joelanford force-pushed the binary-source branch 2 times, most recently from 411d98e to 25c1069 Compare June 4, 2022 01:05
@akihikokuroda (Member) commented:

About the client tool: could it create the Bundle directly, instead of the BundleInstance that embeds a bundle? Then it wouldn't need to wait for the bundle name to be determined, and we could have another command to create the BundleInstance that specifies the bundle.

@joelanford joelanford force-pushed the binary-source branch 2 times, most recently from 216edec to b79f77a Compare June 17, 2022 14:10
@joelanford (Member, Author) replied:

> About the client tool: could it create the Bundle directly, instead of the BundleInstance that embeds a bundle? Then it wouldn't need to wait for the bundle name to be determined, and we could have another command to create the BundleInstance that specifies the bundle.

To actually run the bundle, we have to create a bundle instance that embeds the binary-type bundle. So at least in that case, I don't think we have any choice but to:

  1. Create the BI
  2. Wait for the bundle to exist (a rough sketch of this step follows the list)
  3. Upload the tar.gz for that bundle name
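A rough sketch of step 2, assuming the generated Bundle carries an owner reference back to the BI/BD and that the rukpak v1alpha1 Go types are importable as shown; the PR's actual mechanism may differ:

```go
package rukpakctlsketch

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"sigs.k8s.io/controller-runtime/pkg/client"

	rukpakv1alpha1 "github.com/operator-framework/rukpak/api/v1alpha1"
)

// waitForGeneratedBundle polls until a Bundle owned by the named object
// exists, then returns its generated name. Expressing ownership via owner
// references is an assumption for illustration.
func waitForGeneratedBundle(ctx context.Context, cl client.Client, ownerName string) (string, error) {
	var bundleName string
	err := wait.PollImmediateUntilWithContext(ctx, time.Second, func(ctx context.Context) (bool, error) {
		var bundles rukpakv1alpha1.BundleList
		if err := cl.List(ctx, &bundles); err != nil {
			return false, err
		}
		for _, b := range bundles.Items {
			for _, ref := range b.OwnerReferences {
				if ref.Name == ownerName {
					bundleName = b.Name
					return true, nil
				}
			}
		}
		return false, nil
	})
	return bundleName, err
}
```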

@akihikokuroda (Member) replied:

@joelanford Thanks! I didn't realize that the BundleInstance has to embed the bundle. I thought it could still reference a Bundle.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 24, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 7, 2022
@openshift-ci bot commented Jul 14, 2022

@joelanford: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 14, 2022
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 15, 2022
@joelanford joelanford changed the title from "WIP: Introduce binary source" to "Introduce binary source and rukpakctl run" Jul 15, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 15, 2022
@joelanford joelanford force-pushed the binary-source branch 4 times, most recently from c2a9fbd to fcc02a5 Compare July 18, 2022 17:26
@joelanford joelanford mentioned this pull request Jul 18, 2022
Outdated review threads (resolved): api/v1alpha1/bundle_types.go, cmd/rukpakctl/cmd/content.go
@joelanford joelanford force-pushed the binary-source branch 2 times, most recently from 204883b to 9c9d013 Compare July 19, 2022 20:03
@joelanford (Member, Author) commented Jul 21, 2022

> Do we need to update the core webhook to also accommodate this new source type?

Is there anything I'm missing about what we'd need to check? The bundle spec can't change (and there wouldn't be anything really worth changing since binary isn't configurable)

And the upload manager http server forbids re-uploads of already unpacked binary bundles, so I think that covers the immutability from that perspective.

> Adding godocs for some of the exported utility packages that were added.

Will do. They're all under the internal tree so I somewhat glossed over that, but I agree nonetheless. docs > no docs regardless of what/where.
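On the re-upload point above, a minimal sketch of that kind of immutability check in an upload handler. The URL layout, status codes, and helper names are assumptions; only the PhaseUnpacked constant comes from the rukpak API discussed in this PR:

```go
package uploadmgrsketch

import (
	"net/http"
	"path"
	"strings"

	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"

	rukpakv1alpha1 "github.com/operator-framework/rukpak/api/v1alpha1"
)

// handleUpload rejects uploads for bundles that have already been unpacked,
// which is the behavior described above. Everything here is illustrative.
func handleUpload(cl client.Client) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		bundleName := strings.TrimSuffix(path.Base(r.URL.Path), ".tgz")

		var bundle rukpakv1alpha1.Bundle
		if err := cl.Get(r.Context(), types.NamespacedName{Name: bundleName}, &bundle); err != nil {
			http.Error(w, err.Error(), http.StatusNotFound)
			return
		}
		if bundle.Status.Phase == rukpakv1alpha1.PhaseUnpacked {
			// Re-uploading content for an unpacked bundle would violate
			// bundle immutability, so refuse it.
			http.Error(w, "bundle has already been unpacked; its content cannot be changed", http.StatusConflict)
			return
		}

		// ... persist r.Body to local storage keyed by bundleName here ...
		w.WriteHeader(http.StatusCreated)
	}
}
```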

@timflannagan (Contributor) replied:

> Is there anything I'm missing about what we'd need to check? The bundle spec can't change (and there wouldn't be anything really worth changing since binary isn't configurable)

This is my bad: I forgot that the core webhook just ensures Bundle spec immutability, and that we moved the union-type validation logic into oneOf CRD schema validations. I was talking about the latter when making that comment.

@joelanford joelanford force-pushed the binary-source branch 6 times, most recently from 11554ed to 69c0ae9 Compare July 22, 2022 16:58
@joelanford joelanford changed the title from "Introduce binary source and rukpakctl run" to "Introduce upload source and rukpakctl run" Jul 25, 2022
@joelanford joelanford force-pushed the binary-source branch 2 times, most recently from dd26000 to 1c113c0 Compare July 26, 2022 18:24
@exdx (Member) left a comment

LGTM. Nice job!

)

cmd := &cobra.Command{
	Use: "run <bundleDeploymentName> <bundleDir>",
@exdx (Member) commented Aug 2, 2022

WDYT about making the bundleDeploymentName optional and just using the name of the bundle directory if a BD name is not provided? It would make the UX even easier.

@joelanford (Member, Author) replied:

The main reason I chose that is to ensure that the pivot workflow works regardless of whether you use the same directory name. If you had ./my-bundle-v0.1.0 and ./my-bundle-v0.2.0, it would be really easy to create 2 different BDs when what you really want is to re-use the same BD, and pivot it from v0.1.0 to v0.2.0.

Resolved review threads: cmd/rukpakctl/cmd/run.go (2)
Comment on lines +13 to +14
// This function unsets user and group information in the tar archive so that readers
// of archives produced by this function do not need to account for differences in
Member commented:

This should help alleviate some of the issues we see when attempting to prune Red Hat catalog images that include root-permission directories. Maybe this is worth highlighting in the user-facing documentation, if it's not overly technical?

Contributor commented:

Yep, +1 on that recommendation.

@joelanford (Member, Author) replied:

It might be a little bit technical given the existing way most users would interact with this (rukpakctl run). That subcommand does all the necessary tar.gz stuff under the hood and it is not exposed at all to users.

Regardless, I could definitely see value in documenting the upload manager service as a separate thing because there may be in-cluster processes that can directly interact with the upload manager without requiring the extra port-forward stuff that rukpakctl run provides.
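For reference, a minimal sketch of the "unset user and group information" idea from the quoted comment, using the standard library's archive/tar; this is not the PR's exact code:

```go
package tarutilsketch

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// tarGzDir writes a gzipped tar of dir to w, zeroing user and group
// information on every header so that consumers of the archive do not have
// to account for differences between the environments that produced it.
func tarGzDir(w io.Writer, dir string) error {
	gzw := gzip.NewWriter(w)
	defer gzw.Close()
	tw := tar.NewWriter(gzw)
	defer tw.Close()

	return filepath.WalkDir(dir, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		// Only directories and regular files in this sketch.
		if !info.IsDir() && !info.Mode().IsRegular() {
			return nil
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(dir, p)
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		// Unset ownership so archives are reproducible regardless of the
		// local user that created the files.
		hdr.Uid, hdr.Gid = 0, 0
		hdr.Uname, hdr.Gname = "", ""
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		f, err := os.Open(p)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
}
```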

@timflannagan (Contributor) left a comment

This is really nice work. There's a lot of complexity here, and I didn't play around with it locally so YMMV, but the packages that were introduced were pretty clean. Overall, I didn't have any blocking feedback, and I'm excited to see this finally get close to the finish line. I tried to avoid leaving a huge number of comments, so I stopped midway through in the hope that we can merge this before closing out the current release milestone.

Do we need any follow-up tickets around benchmarking this service? Does it make sense to call out in the user-facing documentation (if we haven't already) and/or the CRD schema documentation that this service can be used to handle arbitrarily sized bundles to avoid the etcd size limitation?

Comment on lines 170 to 173
dynCl, err := dynamic.NewForConfig(cfg)
if err != nil {
	return "", fmt.Errorf("build dynamic client: %v", err)
}
Contributor commented:

Not blocking: create the dynamic client at the driver level and pass it as a parameter instead of the rest config?

Resolved review thread: cmd/rukpakctl/cmd/run.go
Comment on lines +92 to +95
// NOTE: AddMetricsExtraHandler isn't actually metrics-specific. We can run
// whatever handlers we want on the existing webserver that
// controller-runtime runs when MetricsBindAddress is configured on the
// manager.
Contributor commented:

👍
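For reference, the wiring that NOTE describes looks roughly like the following sketch. The "/uploads/" path and handler body are illustrative assumptions; AddMetricsExtraHandler and MetricsBindAddress are the controller-runtime APIs of that era:

```go
package main

import (
	"log"
	"net/http"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Reuse the manager's metrics webserver for an arbitrary HTTP handler.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		MetricsBindAddress: ":8080",
	})
	if err != nil {
		log.Fatal(err)
	}
	if err := mgr.AddMetricsExtraHandler("/uploads/", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Handle bundle uploads/downloads here (illustrative).
		w.WriteHeader(http.StatusOK)
	})); err != nil {
		log.Fatal(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		log.Fatal(err)
	}
}
```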


// GetClusterCA returns an x509.CertPool by reading the contents of a Kubernetes Secret. It uses the provided
// client to get the requested secret and then loads the contents of the secret's "ca.crt" key into the cert pool.
func GetClusterCA(ctx context.Context, cl client.Reader, ns, secretName string) (*x509.CertPool, error) {
Contributor commented:

Would it be easier to pass a NamespacedName instead of individual ns/secretName string fields?
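For context, the behavior described in the GoDoc corresponds roughly to the following body (a sketch; the PR's actual implementation may differ in details such as error wrapping):

```go
package utilsketch

import (
	"context"
	"crypto/x509"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// GetClusterCA reads the "ca.crt" key of the named Secret into a cert pool.
func GetClusterCA(ctx context.Context, cl client.Reader, ns, secretName string) (*x509.CertPool, error) {
	var secret corev1.Secret
	if err := cl.Get(ctx, types.NamespacedName{Namespace: ns, Name: secretName}, &secret); err != nil {
		return nil, fmt.Errorf("get secret %s/%s: %v", ns, secretName, err)
	}
	caData, ok := secret.Data["ca.crt"]
	if !ok {
		return nil, fmt.Errorf("secret %s/%s is missing key %q", ns, secretName, "ca.crt")
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caData) {
		return nil, fmt.Errorf("failed to parse CA certificates from secret %s/%s", ns, secretName)
	}
	return pool, nil
}
```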

Comment on lines +66 to +82
podName := endpoints.Subsets[0].Addresses[0].TargetRef.Name
port := pf.port.IntVal
if port == 0 {
	for _, p := range endpoints.Subsets[0].Ports {
		if p.Name == pf.port.StrVal {
			port = p.Port
			break
		}
	}
}
if port == 0 {
	return fmt.Errorf("could not find port %q for service %q", pf.port.String(), pf.serviceName)
}

path := fmt.Sprintf("/api/v1/namespaces/%s/pods/%s/portforward", pf.serviceNamespace, podName)
host := strings.TrimLeft(pf.cfg.Host, "htps:/")
serverURL := url.URL{Scheme: "https", Path: path, Host: host}
Contributor commented:

I think we could modularize this logic, but this is fine for now.

Resolved review thread: internal/rukpakctl/upload.go
@@ -90,12 +94,19 @@ func (s *unpacker) Unpack(ctx context.Context, bundle *rukpakv1alpha1.Bundle) (*
// NewDefaultUnpacker returns a new composite Source that unpacks bundles using
// a default source mapping with built-in implementations of all of the supported
// source types.
//
// TODO: refactor NewDefaultUnpacker due to growing parameter list
func NewDefaultUnpacker(mgr ctrl.Manager, namespace, provisionerName, unpackImage string) (Unpacker, error) {
Contributor commented:

👍

rukpakv1alpha1.SourceTypeUpload: &Upload{
	baseDownloadURL: baseUploadManagerURL,
	bearerToken:     mgr.GetConfig().BearerToken,
	client:          http.Client{Timeout: 10 * time.Second, Transport: httpTransport},
Contributor commented:

Any value in a constant variable for the timeout period?

@joelanford (Member, Author) replied:

Sure, why not!


Comment on lines 1011 to 1014
if bundle.Status.Phase != rukpakv1alpha1.PhaseUnpacked {
	return errors.New("bundle is not unpacked")
}
return nil
Contributor commented:

Handle this through gomega matchers?
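For illustration, one gomega-based alternative to the manual phase check (a sketch assuming the e2e's usual gomega dot-imports and that ctx, c, and bundle are already in scope):

```go
// Poll until the Bundle reports the Unpacked phase instead of a one-shot
// manual check. Assumes `. "github.com/onsi/gomega"` dot-imports.
Eventually(func(g Gomega) {
	g.Expect(c.Get(ctx, client.ObjectKeyFromObject(bundle), bundle)).To(Succeed())
	g.Expect(bundle.Status.Phase).To(Equal(rukpakv1alpha1.PhaseUnpacked))
}).Should(Succeed())
```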

@exdx (Member) commented Aug 2, 2022

One additional thought I had -- do we ensure that the kube client proxy to the service tears down gracefully after ending the upload? I have seen issues with kubectl port-forward where it would close the stream but the localhost port would still be in use. This would prevent the command from being idempotent.

@joelanford (Member, Author) replied:

> do we ensure that the kube client proxy to the service tears down gracefully after ending the upload?

If I understand your question, I think the answer is yes. We start a goroutine in the error group whose sole purpose is to wait for the context to be canceled and then close the stop channel.

https://github.com/joelanford/rukpak/blob/01f7e6f85dec68610bf9d0aa390fb991d9c28418/internal/rukpakctl/portforward.go#L104-L108

Then at the very end of that function we call eg.Wait() to block returning from Start() until all error group routines have completed.
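A minimal sketch of that teardown pattern, assuming client-go's portforward package and x/sync/errgroup; the names and exact wiring are illustrative:

```go
package rukpakctlsketch

import (
	"context"

	"golang.org/x/sync/errgroup"
	"k8s.io/client-go/tools/portforward"
)

// runPortForward illustrates the teardown described above: one goroutine
// waits for context cancellation and closes the forwarder's stop channel,
// another runs the forward, and Wait blocks until both have returned.
// stopCh is assumed to be the channel the forwarder was constructed with.
func runPortForward(ctx context.Context, fw *portforward.PortForwarder, stopCh chan struct{}) error {
	eg, ctx := errgroup.WithContext(ctx)
	eg.Go(func() error {
		// When the context is canceled, tell the forwarder to shut down,
		// which also releases the local port.
		<-ctx.Done()
		close(stopCh)
		return nil
	})
	eg.Go(func() error {
		// Returns once stopCh is closed or an error occurs.
		return fw.ForwardPorts()
	})
	return eg.Wait()
}
```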

@joelanford (Member, Author) commented:

> CRD schema documentation that this service can be used to handle arbitrarily sized bundles to avoid the etcd size limitation?

I didn't call this out specifically in the "upload" source struct GoDoc because the "image" and "git" sources also support arbitrarily sized bundles. Maybe we should instead call out the local source's limitations?

@timflannagan (Contributor) replied:

Yeah, that sounds reasonable to me. That can be done as a follow-up anyways from my perspective.

@joelanford joelanford merged commit 3b832db into operator-framework:main Aug 3, 2022
@joelanford joelanford deleted the binary-source branch August 3, 2022 21:05
Successfully merging this pull request may close these issues.

Add a new upload source