Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adjust JVM heap size for extremely large memory machine #31567

Merged
merged 8 commits into from
Jun 24, 2024

Conversation

Abacn
Copy link
Contributor

@Abacn Abacn commented Jun 11, 2024

The current setting (70% of max memory) is not efficient for extremely large memory machine.

Internal context: b/340891295#comment14

Please add a meaningful description for your change here


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

@github-actions github-actions bot added the go label Jun 11, 2024
sdks/go.mod Outdated
@@ -20,7 +20,7 @@
// directory.
module github.com/apache/beam/sdks/v2

go 1.20
go 1.21
Copy link
Contributor Author

@Abacn Abacn Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is required for builtin max.

go 1.20 has already eol so we should bump it anyways.

Last time it was July 2023 bumped go 1.19->1.20.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single use of max isn't sufficient cause to bump the minimum SDK version IMO. I'd recommend just writing the standard if instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't think it'll be a problem to do it as a rule, there's just other work that might be good to do when bumping the minimum version as well, and that's very out of scope of this PR.

Basically, bumping the minimum version is a huge change just to avoid a single if statement, compared to writing the conditional.

@github-actions github-actions bot removed the go label Jun 11, 2024
Comment on lines 266 to 268
} else {
return limLarge
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylewise, don't have the else wrapper around a return statement like that, just have the return after the if block.

Copy link
Contributor

@lostluck lostluck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved after suggested change.

Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@Abacn
Copy link
Contributor Author

Abacn commented Jun 12, 2024

actually the change has a bug, either in max(,) or assign (size - 32 << 30) to an uint64 will cause unsigned integer overflow.

This caused Dataflow Example PreCommit to fail.

Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @damondouglas for label java.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

// if set_recommended_max_xmx experiment is enabled, sets xmx to 32G. Under 32G
// JVM enables CompressedOops. CompressedOops utilizes memory more efficiently,
// and has positive impact on GC performance and cache hit rate.
func HeapSizeLimit(size uint64) uint64 {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here added a unit test responding the change request from internal. However to do so I need to change the method to take an external value.

I try to keep the logic, the effective order of execution the same as before (e.g. syscallx.PhysicalMemorySize won't be called if set_recommended_max_xmx is set), PTAL

t.Errorf("HeapSizeLimit(0). Actual (%d). want 1 GB", lim)
}
if lim := HeapSizeLimit(2 << 30); lim != (2 << 30) * 7 / 10 {
t.Errorf("HeapSizeLimit(0). Actual (%d). want 700 MB", lim)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter wasn't updated in the t.Error call. Here and the last one too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sands for catching it. fixed

@chamikaramj
Copy link
Contributor

This can be merged ?

@Abacn
Copy link
Contributor Author

Abacn commented Jun 24, 2024

thanks, there is an internal counterpart (for legacy worker container) and intended to submit both after 2.57.0 release. We can merge this external one first as it won't get integrated into the ongoing release (the internal one will do for container)

@Abacn Abacn merged commit f64aec2 into apache:master Jun 24, 2024
20 checks passed
@Abacn Abacn deleted the adjustheapsize branch June 24, 2024 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants