Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize plan to increase the plan apply throughput #5602

Merged
merged 7 commits into from
Apr 24, 2019
Merged

Conversation

arshjohar
Copy link
Contributor

This PR adds normalization of the plan to commit only the diff for stopped and preempted allocs to the raft log to enable better throughput. It also starts using omitempty on some of the structs during msgpack serialization to omit the empty fields.

return nil, fmt.Errorf("alloc lookup failed: %v", err)
}
if alloc == nil {
continue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing the discussion from
https://github.com/hashicorp/nomad/pull/5407/files/c242adea8786e7541c8c108c342026730b028ff4#diff-ccbd515c67aa55098b48f1106de134aaR4134

There are a few reasons I don't think returning an error is the right approach here:

  1. We do diffs for only stopped or preempted allocs. If they don’t exist, it means they have already been stopped/preempted. If that's the case, it’s fine for the update to not make it to the alloc in that case.
  2. The raft log has already been dispatched at this point, and this code just applies the log to the state store. I'm not sure it'd even be 'correct' to not apply this to the state store, given that we can't remove the raft log entry. Wouldn't the state store be stuck whenever it has to apply this log entry?
  3. It looks like existing code for preempted allocs dealt with this in the same way.
    if existing == nil {

Copy link
Member

@preetapan preetapan Apr 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. should not happen unless the plan applier is acting on stale state store data. We asked about returning an error so that its very visible when this happens. ie. this suggestion was for safety

Re: 2) - returning the error will fail the optimistic apply to the leader's state store in

if snap != nil {
and propagate up to the worker in
if err := w.srv.RPC("Plan.Submit", &req, &resp); err != nil {
, eventually that will fail the evaluation and retry upon which the scheduler can make a new plan with updated information. The state store will always fail on that entry, but a future entry with correct information should succeed. Also see
type PlanFuture interface {
and how we block on the future from raft in
result, err := future.Wait()
.

Re: 3) that logic could be worth changing as well, I will address that in a future PR since I am going to be touching the plan applier again after this PR is merged, you don't need to change that right now though.

nomad/util.go Outdated
// MinVersionPlanNormalization is the minimum version to support the
// normalization of Plan in SubmitPlan, and the denormalization raft log entry committed
// in ApplyPlanResultsRequest
var MinVersionPlanNormalization = version.Must(version.NewVersion("0.9.1"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var MinVersionPlanNormalization = version.Must(version.NewVersion("0.9.1"))
var MinVersionPlanNormalization = version.Must(version.NewVersion("0.9.2"))

Copy link
Member

@preetapan preetapan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work overall

@arshjohar arshjohar merged commit 8d48b77 into master Apr 24, 2019
@endocrimes endocrimes deleted the normalized-plan branch April 25, 2019 13:25
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants