scheduler: fix panic in system jobs when nodes filtered by class #11565

Merged
merged 1 commit into from
Nov 24, 2021

@tgross tgross commented Nov 24, 2021

Fixes #11563

In the system scheduler, if a subset of clients is filtered out by class,
we hit a code path where the `AllocMetric` has been copied, but the
`Copy` method does not instantiate its maps. The scheduler then assigns
to a nil map, which panics. This changeset ensures that the maps are
non-nil before continuing.
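
To make the failure mode concrete, here is a minimal sketch of the pattern described above. The `metric` type, its `ClassFiltered` field, and the `copyCounts` and `filterByClass` names are hypothetical stand-ins for illustration, not Nomad's actual definitions or the actual patch.

```go
package main

import "fmt"

// metric is a hypothetical stand-in for AllocMetric; the field name is
// illustrative only.
type metric struct {
	ClassFiltered map[string]int
}

// copyCounts mimics a copy helper that returns nil for zero-length input.
func copyCounts(in map[string]int) map[string]int {
	if len(in) == 0 {
		return nil
	}
	out := make(map[string]int, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// Copy returns a copy whose map is nil whenever the source map is empty.
func (m *metric) Copy() *metric {
	return &metric{ClassFiltered: copyCounts(m.ClassFiltered)}
}

// filterByClass records a filtered node. The nil check is the guard:
// without it, the increment below would panic with
// "assignment to entry in nil map" on a freshly copied metric.
func (m *metric) filterByClass(class string) {
	if m.ClassFiltered == nil {
		m.ClassFiltered = map[string]int{}
	}
	m.ClassFiltered[class]++
}

func main() {
	orig := &metric{ClassFiltered: map[string]int{}}
	c := orig.Copy()
	fmt.Println(c.ClassFiltered == nil) // true: the copy lost its empty map
	c.filterByClass("gpu")              // safe because of the nil guard
	fmt.Println(c.ClassFiltered["gpu"]) // 1
}
```

The guard sits at the point of use rather than inside the copy helper, which matches the intent of fixing the panic without changing what the helpers return.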

The `Copy` method relies on functions in the `helper` package that all
return nil slices or maps when passed zero-length inputs. This
changeset fixes only the panic and intentionally defers updating those
functions, because changing them could affect memory usage. See
#11564 for more details.
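
The trade-off can be sketched with two hypothetical copy helpers (the names `copyMapNilOnEmpty` and `copyMapAlwaysAlloc` are made up for this example and are not the `helper` package's API). Returning nil for empty input avoids an allocation on every copy, but pushes the nil check onto any caller that later writes to the result.

```go
package main

import "fmt"

// copyMapNilOnEmpty (hypothetical) returns nil for zero-length input,
// so copying an empty map allocates nothing.
func copyMapNilOnEmpty(in map[string]string) map[string]string {
	if len(in) == 0 {
		return nil
	}
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// copyMapAlwaysAlloc (hypothetical) always returns a non-nil map, at the
// cost of one allocation per call even for empty input.
func copyMapAlwaysAlloc(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

func main() {
	empty := map[string]string{}

	a := copyMapNilOnEmpty(empty)
	b := copyMapAlwaysAlloc(empty)

	// Reading from a nil map is safe, so both behave the same for lookups.
	fmt.Println(a == nil, len(a), a["k"] == "")
	fmt.Println(b == nil, len(b), b["k"] == "")

	// Writing is where they differ: b accepts the write, while the same
	// assignment on a would panic with "assignment to entry in nil map".
	b["k"] = "v"
	fmt.Println(b["k"])
}
```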

This impacts 1.2.0 and above, so there's no backport required.

@tgross tgross added this to the 1.2.2 milestone Nov 24, 2021
@tgross tgross merged commit 036282b into main Nov 24, 2021
@tgross tgross deleted the b-panic-in-system-scheduler branch November 24, 2021 17:28
lgfa29 pushed a commit that referenced this pull request Nov 24, 2021
@tgross tgross modified the milestones: 1.2.3, 1.2.2 Nov 24, 2021
github-actions bot commented Nov 8, 2022

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 8, 2022
Successfully merging this pull request may close these issues.

panic in 1.2.0 and 1.2.1 in scheduler for system jobs with class constraints