potential runtimes in kpt fn #2567
cc: @aodinokov
I think this is a great list. I would suggest that if we put a gRPC/REST interface in front of function execution, a lot of these can look very similar to each other, and we can also amortize the launch time over multiple invocations.
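To make the amortization idea concrete, here is a hedged sketch of a long-lived function service: one process handles many evaluations over HTTP, so the per-invocation launch cost disappears. The `/eval` path, the simplified `ResourceList` shape, and the `setLabel` example function are illustrative assumptions, not kpt's actual API.

```go
package main

// Illustrative sketch only: a long-lived HTTP service that evaluates a KRM
// function per request, so one process launch is amortized over many
// invocations. The /eval path, the simplified ResourceList shape, and the
// setLabel example function are assumptions, not kpt's actual API.

import (
	"encoding/json"
	"log"
	"net/http"
)

type KRMResource map[string]interface{}

type ResourceList struct {
	Kind  string        `json:"kind"`
	Items []KRMResource `json:"items"`
}

// setLabel stands in for any function body that mutates the resource list.
func setLabel(items []KRMResource, key, value string) {
	for _, r := range items {
		meta, ok := r["metadata"].(map[string]interface{})
		if !ok {
			meta = map[string]interface{}{}
			r["metadata"] = meta
		}
		labels, ok := meta["labels"].(map[string]interface{})
		if !ok {
			labels = map[string]interface{}{}
			meta["labels"] = labels
		}
		labels[key] = value
	}
}

// evalHandler decodes a ResourceList, applies the function, and returns
// the mutated list, mirroring the stdin/stdout contract of exec'd functions.
func evalHandler(w http.ResponseWriter, req *http.Request) {
	var rl ResourceList
	if err := json.NewDecoder(req.Body).Decode(&rl); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	setLabel(rl.Items, "managed-by", "fn-service")
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(&rl)
}

func main() {
	http.HandleFunc("/eval", evalHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

A gRPC variant would look the same at this level: a stable service endpoint wrapping the same request/response contract that stdin/stdout gives exec'd functions.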
The most commonly used catalog functions could just be built-in, as in kustomize.

Remote functions (#1905): the approach is interesting, but I don't foresee offering managed endpoints. Users would need a function platform of some kind to run them.

And, yes, we do need alternatives to docker. Current performance is unusable (#2469).
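As a rough illustration of the built-in option, a catalog of common functions can be an in-process dispatch table, with no container launch at all. The function names, the `Transformer` signature, and the simplified `Resource` type below are invented for this sketch and are not kustomize's or kpt's real plugin interface.

```go
package main

// Rough illustration of "built-in" functions: a dispatch table applied
// in-process, with no container launch. Names, the Transformer signature,
// and the simplified Resource type are invented for this sketch.

import "fmt"

type Resource map[string]string // grossly simplified KRM object

type Transformer func([]Resource) []Resource

var builtins = map[string]Transformer{
	"ensure-name-prefix": func(rs []Resource) []Resource {
		for _, r := range rs {
			r["name"] = "app-" + r["name"]
		}
		return rs
	},
	"set-namespace": func(rs []Resource) []Resource {
		for _, r := range rs {
			r["namespace"] = "prod"
		}
		return rs
	},
}

// run applies a named built-in; unknown names could instead fall through
// to a container-based runtime rather than erroring.
func run(name string, rs []Resource) ([]Resource, error) {
	fn, ok := builtins[name]
	if !ok {
		return nil, fmt.Errorf("no built-in named %q", name)
	}
	return fn(rs), nil
}

func main() {
	rs := []Resource{{"kind": "Deployment", "name": "web"}}
	out, _ := run("ensure-name-prefix", rs)
	fmt.Println(out[0]["name"]) // prints "app-web"
}
```

The interesting design point is the fallback: a runtime can check the built-in table first and only pay container-launch cost for functions it does not recognize.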
For completeness, we also used to support export to orchestrators: #226
Do you mean the 4th option, or something else?
Users who don't want to use docker (for performance, licensing, or other reasons),
@karlkfi also had some thoughts / concerns
To summarize my concerns: executing functions in containers creates significant limitations:

All of the above make it really hard to use. Possible solutions:

All of the above options allow running
I've been playing with alternative execution engines in kpt & kustomize. A sneak peek is here: https://github.com/justinsb/kpt/commits/runtime_engines

I've been operating on the assumption that there is no one-size-fits-all solution, in that different OSes and environments will require different approaches. But I think we can eliminate a lot of the overhead, and there's no need for Docker. Running containers-in-containers is a critical use case, and one of my goals is to understand which of the options work (well) inside a container.

I like @karlkfi's suggestion of supporting k8s pods as a target. Supporting remote execution via REST/gRPC also makes sense IMO; I have a (more complicated) exploration where we do that even when we launch a container locally, and it works well. It lets us amortize the launch cost, though that cost is pretty minimal with direct linux-namespace sandboxing.

For smaller functions, I think the slowest step is now loading the OpenAPI, which we do to reattach comments - optimizing OpenAPI would be a big win across the board.
Pluggable engines make sense to me. These options don't need to be mutually exclusive. I'm looking forward to seeing whether we can make Starlark simple enough. The typescript sdk isn't bad, and maybe useful in a UI. I also think we should have built-in functions.

And "docker" shouldn't be a requirement. I like the approach of unpacking the containers and basically just exec'ing them in chroots. If cgroups work, that's also fine. The whole pipeline would run in a container.

The model is one of shared responsibility, just like running any containers in Kubernetes. It doesn't seem inherently worse than helm charts. At some point I imagine verifying signatures, and perhaps other metadata, along the lines of grafeas.io.

When executing functions as part of a service, we'll need to decide what identity to run them with. Launching them on demand makes it easier to use the launcher's identity.
A couple of comments/suggestions:
A CUJ we want to enable: I should be able to fetch any off-the-shelf public package and be able to render it with zero set-up. As such, I think executing binaries is the worst option. You lose portability and need to effectively re-implement mechanisms provided by containers (e.g. image registry and caching, limiting host network or filesystem access, etc.).
While the options for providers aren't mutually exclusive, the options for what we invest in for our standard package catalog are more constrained. We can't reasonably port every function to every function provider. If only some of the providers work both server-side and client-side, we might want to focus our efforts on those.
Ref #2302
@karlkfi We have a use case in Crossplane (crossplane/crossplane#2524) in which we'd like a controller to use KRM functions, and were thinking along the same lines as this. ArgoCD lays out a few of the approaches they've explored to execute container pipelines at https://argoproj.github.io/argo-workflows/workflow-executors/. In my mind something like the
One snag is that

We really like the idea of a pipeline of containers, since it seems like it has the lowest barrier to entry for our users. We'd also like to support webhooks - probably KRM functions as webhooks - but we're conscious that there's potentially a higher barrier to entry with that approach, as this issue mentions.
For what it's worth, I've had luck with running buildkitd as a deployment in kubernetes, along with a docker-in-docker deployment/service. If I set

Example here: https://github.com/bzub/kpt-pkgs/blob/37fd0dc305b1463124e19065e353f21a6e501fae/Dockerfile#L28

Update: I've since switched to using
We're also exploring WASM: #3299.
What about developing a Kptfile frontend for buildkit? |
@droot is this a P0? |
There are at least the following runtimes that we can consider supporting in kpt:
The following table compares the pros and cons of these runtimes.
Feedback and thoughts are welcome!