
fix(cf): CF forkjoin threading improvements #5071

Merged
merged 5 commits into from
Nov 5, 2020

Conversation

german-muzquiz
Copy link
Contributor

The CF provider creates a ForkJoinPool for each account defined, with a default max parallelism of 16.
With a large number of accounts the thread count grows significantly: for example, 50 accounts means 50 Applications FJPs + 50 Routes FJPs = 100 pools x 16 threads = 1,600 threads.

This adds a lot of unnecessary overhead, because the only threads actually in use at any given moment are the ones serving running caching agents; most of these threads sit idle most of the time.

This change moves the ForkJoinPool to a singleton bean at the provider level, so there is only one pool for all accounts, with a configurable max parallelism. This way the full capacity of the pool can be used regardless of which agents are running at any given moment.
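The arithmetic behind the change can be sketched as follows. This is a standalone illustration, not the actual clouddriver code: the class name and the comparison logic are mine, while the account count, pools per account, and parallelism of 16 come from the numbers in the description above (the real change wires the shared pool as a Spring singleton bean).

```java
import java.util.concurrent.ForkJoinPool;

// Sketch: per-account pools vs. one shared provider-level pool.
public class SharedPoolSketch {
    public static void main(String[] args) {
        int accounts = 50;
        int poolsPerAccount = 2;   // Applications FJP + Routes FJP per account
        int parallelism = 16;      // default max parallelism per pool

        // Before this PR: every account constructs its own pair of pools.
        int threadsBefore = accounts * poolsPerAccount * parallelism;

        // After this PR: a single shared pool for all accounts, sized by
        // a configurable parallelism setting.
        ForkJoinPool shared = new ForkJoinPool(parallelism);
        int threadsAfter = shared.getParallelism();

        System.out.println(threadsBefore + " -> " + threadsAfter);
        shared.shutdown();
    }
}
```

Caching agents for every account submit work to the one shared pool, so its full capacity is available to whichever agents happen to be running.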

@zachsmith1
Copy link
Contributor

LGTM -- one note: you will need to create a PR in Halyard and update the credentials validator once this PR is merged/versioned. Any time we touch credentials in CD, Halyard will need the accompanying change.

@kevinawoo kevinawoo added the ready to merge Approved and ready for a merge label Nov 5, 2020
@mergify mergify bot added the auto merged Merged automatically by a bot label Nov 5, 2020
@german-muzquiz
Copy link
Contributor Author

@fieldju this is a candidate to be backported to version 1.23: presumably, with the introduction of kork-credentials for cloudfoundry in 1.23, there is a period of time when there are more ManagedAccount cloudfoundry objects than real accounts, waiting to be garbage collected. Since each ManagedAccount object creates 2 ForkJoinPool objects with 16 threads each, this causes high spikes of threads until the garbage collector kicks in, potentially bringing down the clouddriver pod.

This is considered a performance regression for the cloudfoundry provider in 1.23, because these thread spikes were not seen in 1.20 or earlier. This is documented in this bug.

Instead of assuming this is caused by the kork-credentials change and reverting it, this PR fixes the root cause of the issue by defining a singleton ForkJoinPool for all cloudfoundry accounts.

One implication of backporting this PR is that the cloudfoundry setting cloudfoundry.accounts[x].maxCapiConnectionsForCache will be ignored in the upgrade from 1.23.3 to 1.23.4.

There is no easy way to create test cases for performance regressions like this, but I did a manual test with the same configuration reported in the bug, and I only see the expected number of threads, controlled by the new setting cloudfoundry.apiRequestParallelism:
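For reference, the settings involved might look like this in clouddriver's configuration. This is a sketch: the two property names (maxCapiConnectionsForCache and apiRequestParallelism) come from this thread, while the account name and surrounding YAML structure are assumptions for illustration.

```yaml
cloudfoundry:
  accounts:
    - name: my-cf-account               # hypothetical account name
      # Per-account pool sizing; ignored from 1.23.4 onward
      maxCapiConnectionsForCache: 16
  # New in this PR: sizes the single shared ForkJoinPool for all accounts
  apiRequestParallelism: 16
```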

[screenshots: observed thread counts before and after the change]

@german-muzquiz german-muzquiz added the backport-candidate Add to PRs to designate release branch patch candidates. label Dec 11, 2020
@fieldju
Copy link
Contributor

fieldju commented Dec 11, 2020

@Mergifyio backport release-1.23.x

@mergify
Copy link
Contributor

mergify bot commented Dec 11, 2020

Command backport release-1.23.x: success

Backports have been created

mergify bot pushed a commit that referenced this pull request Dec 11, 2020
* feat(kubernetes): Send SIGKILL to kubectl

* fix(cf): CF forkjoin threading improvements

(cherry picked from commit 1f7d7b7)
fieldju pushed a commit that referenced this pull request Dec 11, 2020
* feat(kubernetes): Send SIGKILL to kubectl

* fix(cf): CF forkjoin threading improvements

(cherry picked from commit 1f7d7b7)

Co-authored-by: German Muzquiz <35276119+german-muzquiz@users.noreply.github.com>
@fieldju fieldju removed the backport-candidate Add to PRs to designate release branch patch candidates. label Dec 14, 2020
Labels
auto merged Merged automatically by a bot ready to merge Approved and ready for a merge target-release/1.24