Apply KRM functions in batches #5673

Open
1 of 2 tasks
Homulvas opened this issue Apr 22, 2024 · 3 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@Homulvas

Eschewed features

  • This issue is not requesting templating, unstructured edits, build-time side-effects from args or env vars, or any other eschewed feature.

What would you like to have added?

Following the suggestion in #5173, we have implemented our own resource transformers. This generally works, but we have run into a performance issue: for a big enough target, each transformer takes a second or two to process. That alone wouldn't be so bad, but every additional transformer adds the same cost, so the runtime grows linearly with the number of transformers. In some cases, applying the transformers takes up the majority of the whole build.

Why is this needed?

Piping the complete input/output for each KRM function separately is inefficient and makes builds very slow for big enough targets.
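
To make the scaling argument concrete, here is a rough cost model in Go. It is only a sketch: the function names and all timings are hypothetical, chosen for illustration, and are not measurements from kustomize. It compares paying the parse and serialize cost once per function with paying it once per build.

```go
package main

import (
	"fmt"
	"time"
)

// perFunctionCost models the behavior described above: every function
// pays for parsing and serializing the full resource list.
func perFunctionCost(n int, parse, transform, serialize time.Duration) time.Duration {
	return time.Duration(n) * (parse + transform + serialize)
}

// batchedCost models the requested behavior: parse once, run all n
// transformations in memory, serialize once.
func batchedCost(n int, parse, transform, serialize time.Duration) time.Duration {
	return parse + time.Duration(n)*transform + serialize
}

func main() {
	// Hypothetical timings, not measured values.
	parse := 800 * time.Millisecond
	transform := 50 * time.Millisecond
	serialize := 400 * time.Millisecond

	for _, n := range []int{1, 5, 10} {
		fmt.Printf("n=%d per-function=%v batched=%v\n", n,
			perFunctionCost(n, parse, transform, serialize),
			batchedCost(n, parse, transform, serialize))
	}
}
```

Under these assumed numbers the two models agree for a single function and diverge as functions are added, because only the per-function model multiplies the parse and serialize terms by n.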

Can you accomplish the motivating task without this feature, and if so, how?

One possible workaround is to have all transformations inside a single transformer file. However, this makes the transformers hard to use as you have to treat them with special care.
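
The workaround amounts to function composition. A minimal sketch, assuming resources are already parsed into maps; `Resource`, `Transform`, `compose`, `setLabel`, and `setNamespace` are hypothetical names for illustration, not part of kustomize or kyaml:

```go
package main

import "fmt"

// Resource is a hypothetical stand-in for one parsed Kubernetes object.
type Resource map[string]any

// Transform mutates a single resource in place.
type Transform func(Resource)

// compose folds several transforms into one, so the surrounding
// parse and serialize work happens only once for the whole group.
func compose(ts ...Transform) Transform {
	return func(r Resource) {
		for _, t := range ts {
			t(r)
		}
	}
}

// setLabel returns a transform that sets one label on the resource.
func setLabel(key, value string) Transform {
	return func(r Resource) {
		labels, ok := r["labels"].(map[string]string)
		if !ok {
			labels = map[string]string{}
			r["labels"] = labels
		}
		labels[key] = value
	}
}

// setNamespace returns a transform that sets the namespace field.
func setNamespace(ns string) Transform {
	return func(r Resource) { r["namespace"] = ns }
}

func main() {
	all := compose(setLabel("team", "infra"), setNamespace("prod"))
	r := Resource{"kind": "Deployment"}
	all(r)
	fmt.Println(r["namespace"], r["labels"])
}
```

The drawback mentioned above shows up in `main`: the single transformer has to know about every transformation it bundles, so the pieces can no longer be enabled or reused independently.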

What other solutions have you considered?

A workaround has been described above.

Anything else we should know?

No response

Feature ownership

  • I am interested in contributing this feature myself! 🎉
Homulvas added the kind/feature label on Apr 22, 2024
k8s-ci-robot added the needs-triage label (indicates an issue or PR lacks a `triage/foo` label and requires one) on Apr 22, 2024
@koba1t
Member

koba1t commented Jun 24, 2024

Hi @Homulvas
Thanks for submitting the feature request.
I understand what your problem is.

I'm sorry, but I don't understand what you mean by batches in your request.

My guess is that your transformer takes a few minutes to start, and you want to process many yamls all at once.
Isn't that right?

/triage needs-information

k8s-ci-robot added the triage/needs-information label and removed the needs-triage label on Jun 24, 2024
@Homulvas
Author

Homulvas commented Jun 25, 2024

> I'm sorry, but I don't understand what you mean by batches in your request.

I may have worded the request poorly. The crux of the issue is that a lot of unnecessary I/O is done when there are multiple transformations.

> My guess is that your transformer takes a few minutes to start, and you want to process many yamls all at once. Isn't that right?

The issue is that it's always virtually the same YAML. Each transformation reads a big YAML input, applies a small change, and writes the output. When there are several transformations, only the transform step should be repeated, not the read and the write.
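
A sketch of the difference, in Go. It uses `encoding/json` from the standard library as a stand-in for the YAML handling, and every name in it (`Transform`, `ioCount`, `runSeparately`, `runBatched`, `setField`) is hypothetical; it is not how kustomize is implemented, only an illustration of where the repeated I/O comes from.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Transform edits the parsed resources in place.
type Transform func(items []map[string]any)

// ioCount tracks how many full parse and serialize passes were made.
type ioCount struct{ parses, serializes int }

// runSeparately mimics per-function piping: every transform receives
// serialized input and hands back serialized output.
func runSeparately(input []byte, ts []Transform, c *ioCount) ([]byte, error) {
	data := input
	for _, t := range ts {
		var items []map[string]any
		if err := json.Unmarshal(data, &items); err != nil {
			return nil, err
		}
		c.parses++
		t(items)
		out, err := json.Marshal(items)
		if err != nil {
			return nil, err
		}
		c.serializes++
		data = out
	}
	return data, nil
}

// runBatched parses once, applies every transform in memory, and
// serializes once.
func runBatched(input []byte, ts []Transform, c *ioCount) ([]byte, error) {
	var items []map[string]any
	if err := json.Unmarshal(input, &items); err != nil {
		return nil, err
	}
	c.parses++
	for _, t := range ts {
		t(items)
	}
	out, err := json.Marshal(items)
	if err != nil {
		return nil, err
	}
	c.serializes++
	return out, nil
}

// setField returns a transform that sets one top-level field on every item.
func setField(key, value string) Transform {
	return func(items []map[string]any) {
		for _, it := range items {
			it[key] = value
		}
	}
}

func main() {
	input := []byte(`[{"kind":"Deployment"},{"kind":"Service"}]`)
	ts := []Transform{setField("namespace", "prod"), setField("team", "infra"), setField("env", "live")}

	var sep, bat ioCount
	a, err := runSeparately(input, ts, &sep)
	if err != nil {
		panic(err)
	}
	b, err := runBatched(input, ts, &bat)
	if err != nil {
		panic(err)
	}
	fmt.Println("same output:", string(a) == string(b))
	fmt.Println("separate:", sep, "batched:", bat)
}
```

Both paths produce the same bytes; the only difference is how many times the full document is parsed and serialized along the way.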

@koba1t
Member

koba1t commented Jun 26, 2024

Sorry, I still don't understand what you are asking for.
What do you see as the problem?

Are you concerned about the call overhead of a custom transformer?

I think the current custom transformer interface looks like it can already do batch processing:

// ResourceList is a Kubernetes list type used as the primary data interchange format
// in the Configuration Functions Specification:
// https://github.com/kubernetes-sigs/kustomize/blob/master/cmd/config/docs/api-conventions/functions-spec.md
// This framework facilitates building functions that receive and emit ResourceLists,
// as required by the specification.
type ResourceList struct {
	// Items is the ResourceList.items input and output value.
	//
	// e.g. given the function input:
	//
	//    kind: ResourceList
	//    items:
	//    - kind: Deployment
	//      ...
	//    - kind: Service
	//      ...
	//
	// Items will be a slice containing the Deployment and Service resources
	// Mutating functions will alter this field during processing.
	// This field is required.
	Items []*yaml.RNode `yaml:"items" json:"items"`
	// FunctionConfig is the ResourceList.functionConfig input value.
	//
	// e.g. given the input:
	//
	//    kind: ResourceList
	//    functionConfig:
	//      kind: Example
	//      spec:
	//        foo: var
	//
	// FunctionConfig will contain the RNodes for the Example:
	//    kind: Example
	//    spec:
	//      foo: var
	FunctionConfig *yaml.RNode `yaml:"functionConfig,omitempty" json:"functionConfig,omitempty"`
	// Results is ResourceList.results output value.
	// Validating functions can optionally use this field to communicate structured
	// validation error data to downstream functions.
	Results Results `yaml:"results,omitempty" json:"results,omitempty"`
}
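
For readers unfamiliar with the format, here is a small sketch of what a single invocation over such a list can look like. It deliberately avoids the real framework: `resourceList` is a simplified, hypothetical stand-in that keeps only `kind`, `items`, and `functionConfig`, `annotateAll` is a made-up function, and JSON (a subset of YAML) is used so the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// resourceList is a simplified, hypothetical stand-in for the
// framework's ResourceList; it is not the kyaml type.
type resourceList struct {
	Kind           string           `json:"kind"`
	Items          []map[string]any `json:"items"`
	FunctionConfig map[string]any   `json:"functionConfig,omitempty"`
}

// annotateAll parses one serialized list, copies functionConfig.spec.foo
// onto every item, and serializes the list again. All items are handled
// in a single call.
func annotateAll(input []byte) ([]byte, error) {
	var rl resourceList
	if err := json.Unmarshal(input, &rl); err != nil {
		return nil, err
	}
	// Reading from a nil map is safe, so a missing spec yields "".
	spec, _ := rl.FunctionConfig["spec"].(map[string]any)
	foo, _ := spec["foo"].(string)
	for _, item := range rl.Items {
		if item == nil {
			continue // skip null entries rather than writing to a nil map
		}
		item["foo"] = foo
	}
	return json.Marshal(rl)
}

func main() {
	input := []byte(`{"kind":"ResourceList","items":[{"kind":"Deployment"},{"kind":"Service"}],"functionConfig":{"kind":"Example","spec":{"foo":"var"}}}`)
	out, err := annotateAll(input)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(out))
}
```

This illustrates batching across resources within one function call. It does not by itself show several functions sharing one parse of the list.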
