
Sharp increase in resource usage between v0.9 and v0.10 #642

Closed
antoineco opened this issue Mar 9, 2022 · 10 comments

@antoineco
Contributor

In a project with about 60 ko:// references, we are seeing a sudden increase in resource usage on our CI workers, causing all our release jobs to fail because go build processes are being killed.

Judging by the logs ("Building github.com/triggermesh/triggermesh/cmd/..."), it seems like builds are now all started at once, without respecting nprocs. This wasn't the case in v0.9 (again, judging only by the logs).

Before

[screenshot: CI worker resource usage graph with v0.9]

After

[screenshot: CI worker resource usage graph with v0.10]

@imjasonh
Member

imjasonh commented Mar 9, 2022

Thanks for reporting this!

Looking through the diff now: v0.9.0...v0.10.0

Are you building multi-platform images (--platform=all, etc.)? #527 changed behavior since v0.9 so that platforms are built concurrently, which might be what you're seeing.

Aside from that, the only big change since v0.9 that I could imagine causing this is SBOM generation, but I'd be surprised if it caused this much extra load. It could be worth adding --sbom=none to see whether that resolves the issue.

@mattmoor you're also a heavy ko user, are you seeing increased resource usage in v0.10?

@antoineco
Contributor Author

@imjasonh No, we're not building multi-platform images.
I'll try disabling SBOM generation and report back.

@mattmoor
Collaborator

mattmoor commented Mar 9, 2022

No, but I'll ask around 🤔

@antoineco
Contributor Author

antoineco commented Mar 10, 2022

Same issue with --sbom=none. The CPU is maxed out and go build processes are being killed by the OOM killer.

Builds seem to be started all at once. I would expect no more than 16 at a time (8 cores, i.e. 16 logical CPUs with hyperthreading); a sketch of the kind of throttling I'd expect follows the logs:

2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/xslttransformation-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/azureeventhubstarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/twiliotarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awskinesissource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/webhooksource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/datadogtarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awskinesistarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awssqstarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/transformation-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awslambdatarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/elasticsearchtarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awssnssource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awscomprehendtarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/azureeventhubsource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/azurequeuestoragesource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/zendesksource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/googlecloudfirestoretarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/jiratarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/alibabaosstarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/googlesheettarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/salesforcetarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/logztarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/googlecloudpubsubsource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/sendgridtarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awscloudwatchsource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/azureiothubsource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/tektontarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awsdynamodbtarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/xmltojsontransformation-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/splitter-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/slacktarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awscloudwatchlogssource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/slacksource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awscognitoidentitysource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/ocimetricssource-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/opentelemetrytarget-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awss3target-adapter for linux/amd64
2022/03/10 21:26:11 Building github.com/triggermesh/triggermesh/cmd/awsperformanceinsightssource-adapter for linux/amd64
2022/03/10 21:27:12 Unexpected error running "go build": exit status 1
go build k8s.io/api/core/v1: /usr/local/go/pkg/tool/linux_amd64/compile: signal: killed
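
For illustration, what I'd expect is a GOMAXPROCS-sized limit on concurrent go build invocations, roughly like the sketch below (not ko's actual code; the import paths are just placeholders copied from the logs):

package main

import (
    "fmt"
    "os/exec"
    "runtime"
    "sync"
)

func main() {
    // Placeholder import paths; the real project has ~60 of them.
    importpaths := []string{
        "github.com/triggermesh/triggermesh/cmd/xslttransformation-adapter",
        "github.com/triggermesh/triggermesh/cmd/twiliotarget-adapter",
        // ...
    }

    // Cap concurrent builds at GOMAXPROCS instead of starting them all at once.
    sem := make(chan struct{}, runtime.GOMAXPROCS(0))

    var wg sync.WaitGroup
    for _, ip := range importpaths {
        wg.Add(1)
        go func(ip string) {
            defer wg.Done()
            sem <- struct{}{}        // acquire a slot
            defer func() { <-sem }() // release it when the build finishes

            fmt.Println("Building", ip, "for linux/amd64")
            if err := exec.Command("go", "build", ip).Run(); err != nil {
                fmt.Println("build failed:", ip, err)
            }
        }(ip)
    }
    wg.Wait()
}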

@dprotaso
Contributor

@antoineco how do you get graphs like that?

I'm curious if we need something like https://github.com/uber-go/automaxprocs.
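
For reference, adopting that library is typically just a blank import in the main package; at init time it caps GOMAXPROCS at the container's CPU quota rather than the host's logical CPU count. A minimal sketch of what that wiring would look like (not something ko does today):

package main

import (
    "fmt"
    "runtime"

    // Blank import for its side effect: at init time automaxprocs lowers
    // GOMAXPROCS to the cgroup CPU quota instead of the host CPU count.
    _ "go.uber.org/automaxprocs"
)

func main() {
    fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
}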

@antoineco
Contributor Author

@dprotaso They are from CircleCI.

I haven't had time to investigate further yet, but I have a feeling parallelism isn't regulated at all anymore. Based solely on the log output, we used to observe nprocs builds running concurrently; now all builds seem to start at once. And since we have ~60 images to build during releases, it OOMs almost instantly.

@dprotaso
Contributor

Playing with CircleCI: by default Go reports GOMAXPROCS as 36, but automaxprocs calculates it as 4 (at least for my large instance). So I think using automaxprocs would be beneficial here to pick a sensible default in a container environment.

But GOMAXPROCS has been in use in ko for quite some time, so I wonder whether the prior parallelization code was working as expected?
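
A rough sketch of that comparison, using the maxprocs package directly (the exact numbers depend on the executor's CPU quota):

package main

import (
    "fmt"
    "log"
    "runtime"

    "go.uber.org/automaxprocs/maxprocs"
)

func main() {
    // By default this reports the host's logical CPU count
    // (36 on the executor above), not the container's quota.
    fmt.Println("default GOMAXPROCS:", runtime.GOMAXPROCS(0))

    // maxprocs.Set lowers GOMAXPROCS to the cgroup CPU quota (4 above);
    // the returned func restores the previous value.
    undo, err := maxprocs.Set(maxprocs.Logger(log.Printf))
    defer undo()
    if err != nil {
        log.Printf("maxprocs: %v", err)
    }

    fmt.Println("adjusted GOMAXPROCS:", runtime.GOMAXPROCS(0))
}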

@antoineco
Contributor Author

Oh wow, 36 for 8 CPU cores, that's unexpectedly high. Thanks for looking into it!

@dprotaso
Contributor

dprotaso commented Mar 14, 2022

@antoineco can you test the above change in your CI?

@antoineco
Contributor Author

@dprotaso sure thing.
I just did, and it worked like a charm 👍 It seems like ko is now building 16 images in parallel on a worker with 8 CPU cores.
Building TriggerMesh's components was super quick (warm build cache) and the resource usage was very reasonable.

Thanks for the patch!
