Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds HIP for hook parallelism #165

Closed
wants to merge 1 commit into from
Closed

Adds HIP for hook parallelism #165

wants to merge 1 commit into from

Conversation

abaehremc
Copy link

Adds a proposal for the hook parallelism feature.

Signed-off-by: Andrew Baehre abaehre@morningconsult.com

@discreet
Copy link

Bump???

@mark007
Copy link

mark007 commented Mar 2, 2021

Does someone need to now approve of this idea and push towards getting the other code patches in?


## Rationale

The proposed design is to add a flag called "hook-parallelism", which
Copy link
Member

@bacongobbler bacongobbler Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When designing proposals, we commonly refer to the User Profile a proposal is intended to target. It helps us gauge who is the intended audience, what is their expected knowledge of Helm's architecture, and how the proposal fits into Helm's design.

If I'm reading this proposal correctly, this proposal is aimed at the Application Operator. In other words, it written for the person who is running helm test, not the person who wrote the tests themselves (the Application Distributor).

With this design, the Application Distributor does not have any control whether the tests should be run in serial or in parallel. That seems to me like a major design flaw with this proposal. It is common for someone to write a test that MUST be run in parallel (testing a race condition) or not at all (testing a workflow).

I would highly re-consider who is the intended target audience for this proposal and how it fits into Helm's overall design. I would prefer that test hook parallelism is a feature which the Application Distributor can opt into, not the Application Operator. In other words, the Application Distributor MUST have full control over which tests can be run in parallel, and what cannot.

Have a look at Go's testing package and see how they've designed test parallelism into the package. The person running go test has no control over which tests can be run in parallel; they can only control which tests/subtests are run.

https://golang.org/pkg/testing/#T.Parallel

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this point was an excellent call out. I definitely was thinking of this from a much more narrow view of being both the Application Distributor and Application Operator. I've tried to add some blurbs addressing this point, please let me know if you would like any more detail.

@margori
Copy link

margori commented Mar 15, 2021

I really want and need parallel execution of hooks. It would save me several minutes in automatic deploys

@technosophos
Copy link
Member

During the HIP review with @mattfarina, we were unsure whether this HIP was still active. Please reply here, @abaehremc, if you are still planning on working on this HIP.

@abaehremc
Copy link
Author

My apologies, I would like to continue working on this HIP. I will attempt to get to it on Monday (3/29).

Signed-off-by: abaehre <abaehre@morningconsult.com>
@abaehremc
Copy link
Author

Updated!

@helm-bot helm-bot added size/L and removed size/M labels Mar 31, 2021
@abaehremc abaehremc requested a review from bacongobbler April 27, 2021 14:42
not exceeding the value of the flag concurrently. This will allow for users to only use the
flag if they require their hooks to run in parallel. Not including the flag,
or including the flag with a value of 0 or 1 will be the same as running
serially.
Copy link
Member

@bacongobbler bacongobbler May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this design falls to the same failures I mentioned earlier, unfortunately.

Tests are commonly written in two manners:

  1. A test that SHOULD run in parallel (testing race conditions, or benchmark suites)
  2. A test that MUST NOT run in parallel (an end-to-end acceptance test suite)

From your proposal here, tests are always in serial by default. A chart developer can write a test that SHOULD run in parallel with other tests, but by default it'll run in serial. That seems like a major design flaw.

In Go, if a test does not call t.Parallel(), it is run in serial. For example, if I use the following test cases:

func Test1(t *testing.T) {
    fmt.Println("Test1 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test1 end")
}

func Test2(t *testing.T) {
    fmt.Println("Test2 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test2 end")
}

func Test3(t *testing.T) {
    fmt.Println("Test3 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test3 end")
}

I receive the following output (note that tests may run in any order, but they run in serial):

><> go test ./...
Test2 start
Test2 end
Test1 start
Test1 end
Test3 start
Test3 end

If I mark them with t.Parallel:

func Test1(t *testing.T) {
    t.Parallel()
    fmt.Println("Test1 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test1 end")
}

func Test2(t *testing.T) {
    t.Parallel()
    fmt.Println("Test2 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test2 end")
}

func Test3(t *testing.T) {
    t.Parallel()
    fmt.Println("Test3 start")
    time.Sleep(time.Second * 2)
    fmt.Println("Test3 end")
}

Running it again will result in the following output:

><> go test ./...
Test1 start
Test3 start
Test2 start
Test2 end
Test1 end
Test3 end

If you execute it with go test -parallel=1, the output will again become sequential.

Test1 start
Test1 end
Test3 start
Test3 end
Test2 start
Test2 end

I would urge you again to look at Go's testing package, refer to the design, and re-apply similar concepts here. The developer MAY have controls to limit the number of cores (in our case, pods) which the tests may run, which MAY force the tests to run in serial. But if the tests are expected to run in parallel by default, they should do so by default.

https://golang.org/pkg/testing/#T.Parallel

See also go help testflag for more information on the --parallel flag.

@PaulFlorea
Copy link

It seems like this ticket has stalled?
I would really like to see this feature go out -- it will save so much time for charts with many pre-upgrade hooks.

@dududko
Copy link

dududko commented Oct 8, 2021

@abaehremc Thank you for addressing the issue with helm hooks. I found out about this issue while implementing pre-install hooks. And it looks like my case differs from everyone else's (or they did not comment on their issue), so I might have a completely different opinion on this topic.

First, I will shortly explain my use case and then the issues with current implementation with helm hooks and how I think this issue could be addressed.

Use-case

I need to run multiple pre-install hooks which consists not only of k8s jobs but require secrets and configmaps. There are two jobs with weight=1 and two jobs with weight=2. Each secret and configmap is used only by unique job, which means that I have 4 logical units: A1, B1 (as per weight=1) and A2 and B2 (as per weight=2).

Issues with current helm hooks architecture

Hooks sorting

First problem I have is that jobs fail to start because the job can be created before the creation of a secret or a configmap.
The cause of issue is the sort implementation. It is based on the name of the hook as opposed to the kind of the k8s resource in the regular helm installation.

The workaround I am using now is to set weight=0 to all A1, A2, B1, B2 secrets and configmaps. In this case all secrets and configmaps are installed before the pre-install k8s jobs.

This issue can also be addressed if helm would sort helm hooks by weight and kind, as opposed to name and weight.

Concurrency

In my case I would like the jobs with the same weight to be executed in parallel. Unfortunately we all know that it is not possible.

Proposal

I propose to use the same workflow in the helm hooks as in the installation of the main resources of a helm release.
Group hooks by weight, then in the ascending weight order apply all k8s resources and wait for successful job completion + apply helm hook delete policy. In this case there is no need to add any special concurrency logic to helm hooks whatsoever, everything will be managed in the standard helm workflow (I believe it will just install each individual k8s resource in the separate goroutine).

Here is the short example of how it can be implemented:

func (cfg *Configuration) execHook(rl *release.Release, hook release.HookEvent, timeout time.Duration) error {

	executingHooks := []*release.Hook{}

	for _, h := range rl.Hooks {
		for _, e := range h.Events {
			if e == hook {
				executingHooks = append(executingHooks, h)
			}
		}
	}
    
    var hooksByWeight [][]*release.Hook
    hooksByWeight = groupHooksByWeight(executingHooks)
    
    for _, weightGroup := range hooksByWeight {
        sorted := sortHooksByKind(result.hooks, InstallOrder)
        
        // apply delete hooks policy
        // todo
        
        // install resources
        if _, err := i.cfg.KubeClient.Create(resources); err != nil {
			return i.failRelease(rel, err)
		}
        
        // wait for jobs
        if err := i.cfg.KubeClient.WaitWithJobs(resources, i.Timeout); err != nil {
            return i.failRelease(rel, err)
        }
        
        // apply delete hooks policy
        // todo
    }
    
    
    
    
    return nil
}

I believe that this approach will solve the problem with sequential installation of helm hooks and will improve the maintainability of helm in general.

Does this approach make any since or I went in a completely wrong direction ?
I would appreciate to hear any feedback on this proposal.

@pidj
Copy link

pidj commented Mar 9, 2022

Hello :)
Any news on this change ?
I'm interested by parallelism of same-weight + same-kind resources !

@ktaletsk
Copy link

Hi, I am also interested in this to be implemented. I've migrated the collection of standalone K8s Jobs, which could run in parallel) to Helm Hooks and experiencing regression, because they now run serially.

@bacongobbler
Copy link
Member

It looks like this proposal has stalled out after several rounds of review from core maintainers. If someone wants to carry forward the work, feel free.

@bacongobbler
Copy link
Member

@dududko have you had a chance to look at helm/helm#10768 and the corresponding PR? Seems like your proposal discusses hook weight sorting which is a different topic altogether than what is being discussed here.

JvD-Ericsson added a commit to Nordix/helm that referenced this pull request Feb 9, 2023
Add --hook-parallelism flag to helm deployment hooks but not the testing hook

What this PR does / why we need it:
Adds hook parallelism with a default of 1 as a flag for applicable commands. Does not add parallelism for test hooks as that seems to be the reason why this helm/community#165 was closed but I think there is still value having deployment hooks work in parallel.

Special notes for your reviewer:
Based off of helm#8946 by abaehremc and
helm#7792 by akhilles
JvD-Ericsson added a commit to Nordix/helm that referenced this pull request Feb 9, 2023
Add --hook-parallelism flag to helm deployment hooks but not the testing hook

What this PR does / why we need it:
Adds hook parallelism with a default of 1 as a flag for applicable commands. Does not add parallelism for test hooks as that seems to be the reason why this helm/community#165 was closed but I think there is still value having deployment hooks work in parallel.

Special notes for your reviewer:
Based off of helm#8946 by abaehremc and
helm#7792 by akhilles

Signed-off-by: Jeff van Dam <jeff.van.dam@est.tech>
JvD-Ericsson added a commit to Nordix/helm that referenced this pull request Feb 9, 2023
Add --hook-parallelism flag to helm deployment hooks but not the testing hook

What this PR does / why we need it:
Adds hook parallelism with a default of 1 as a flag for applicable commands. Does not add parallelism for test hooks as that seems to be the reason why this helm/community#165 was closed but I think there is still value having deployment hooks work in parallel.

Special notes for your reviewer:
Based off of helm#8946 by abaehremc and
helm#7792 by akhilles

Signed-off-by: Jeff van Dam <jeff.van.dam@est.tech>
JvD-Ericsson added a commit to Nordix/helm that referenced this pull request Feb 9, 2023
Add --hook-parallelism flag to helm deployment hooks but not the testing hook

What this PR does / why we need it:
Adds hook parallelism with a default of 1 as a flag for applicable commands. Does not add parallelism for test hooks as that seems to be the reason why this helm/community#165 was closed but I think there is still value having deployment hooks work in parallel.

Special notes for your reviewer:
Based off of helm#8946 by abaehremc and
helm#7792 by akhilles

Signed-off-by: Jeff van Dam <jeff.van.dam@est.tech>
@imaldonado1
Copy link

what happened with this case? I have a yaml with 3 pods running 1 command each one. I want to do something like this: helm test helm test example -n example --parallel 3

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.