
Preemption for system jobs #4794

Merged
merged 38 commits on Nov 2, 2018
Commits
38 commits
5f27e00
structs and API changes to plan and alloc structs needed for preemption
preetapan Sep 10, 2018
8004160
REview feedback
preetapan Sep 10, 2018
bf7192c
Add number of evictions to DesiredUpdates struct to use in CLI/API
preetapan Sep 11, 2018
fc266f7
structs and API changes to plan and alloc structs needed for preemption
preetapan Sep 10, 2018
715d869
Implement preemption for system jobs.
preetapan Sep 21, 2018
b5cbd73
Fix comment
preetapan Sep 24, 2018
13e314c
Fix logic bug, unit test for plan apply method in state store
preetapan Sep 24, 2018
fd6bff2
Fix linting and better comments
preetapan Sep 24, 2018
51c5bae
Fix linting
preetapan Sep 27, 2018
784b96c
Support for new scheduler config API, first use case is to disable pr…
preetapan Sep 28, 2018
2143fa2
Use scheduler config from state store to enable/disable preemption
preetapan Sep 28, 2018
6966e3c
Make preemption config a struct to allow for enabling based on schedu…
preetapan Oct 1, 2018
24b3934
Modify preemption code to use new style of resource structs
preetapan Oct 16, 2018
9f35923
fix end to end scheduler test to use new resource structs correctly
preetapan Oct 17, 2018
a960cce
comments
preetapan Oct 17, 2018
655689a
Preempted allocations should be removed from proposed allocations
preetapan Oct 17, 2018
c4e0e66
Restore/Snapshot plus unit tests for scheduler configuration
preetapan Oct 18, 2018
c3b8e4f
Add fsm layer tests
preetapan Oct 18, 2018
12af2ae
make default config a variable
preetapan Oct 18, 2018
c4a04eb
style fixes
preetapan Oct 18, 2018
191b862
More review comments
preetapan Oct 18, 2018
21432d6
More style and readablity fixes from review
preetapan Oct 18, 2018
4cc21fb
more minor cleanup
preetapan Oct 18, 2018
35635ba
refactor preemption code to use method recievers and setters for comm…
preetapan Oct 29, 2018
8800585
Introduce a response object for scheduler configuration
preetapan Oct 29, 2018
17344a7
Introduce interface with multiple implementations for resource distance
preetapan Oct 30, 2018
f2b0277
Fix return type in tests after refactor
preetapan Oct 30, 2018
22d156f
review comments
preetapan Nov 1, 2018
993b6a2
Cleaner way to exit early, and fixed a couple more places reading fro…
preetapan Nov 1, 2018
3ad7b3f
More review comments
preetapan Nov 1, 2018
06ad182
Plumb alloc resource cache in a few more places.
preetapan Nov 1, 2018
35d31f8
more minor review feedback
preetapan Nov 1, 2018
b3738a0
unit test plan apply with preemptions
preetapan Nov 2, 2018
0015095
Handle static port preemption when there are multiple devices
preetapan Nov 2, 2018
8235919
Fix static port preemption to be device aware
preetapan Nov 2, 2018
1380acb
dereference safely
preetapan Nov 2, 2018
c49a3e2
Fix test setup
preetapan Nov 2, 2018
97cf4e1
Address more minor code review feedback
preetapan Nov 2, 2018
2 changes: 1 addition & 1 deletion api/jobs.go
@@ -1024,7 +1024,7 @@ type DesiredUpdates struct {
InPlaceUpdate uint64
DestructiveUpdate uint64
Canary uint64
Evict uint64
Preemptions uint64
}

type JobDispatchRequest struct {
11 changes: 8 additions & 3 deletions nomad/leader.go
@@ -42,8 +42,12 @@ const (

var minAutopilotVersion = version.Must(version.NewVersion("0.8.0"))

// Default configuration for scheduler with preemption emabled for system jobs
var defaultSchedulerConfig = &structs.SchedulerConfiguration{PreemptionConfig: structs.PreemptionConfig{SystemSchedulerEnabled: true}}
// Default configuration for scheduler with preemption enabled for system jobs
var defaultSchedulerConfig = &structs.SchedulerConfiguration{
PreemptionConfig: structs.PreemptionConfig{
SystemSchedulerEnabled: true,
},
}

// monitorLeadership is used to monitor if we acquire or lose our role
// as the leader in the Raft cluster. There is some work the leader is
@@ -1237,7 +1241,8 @@ func (s *Server) getOrCreateAutopilotConfig() *structs.AutopilotConfig {
return config
}

// getOrCreateSchedulerConfig is used to get the scheduler config, initializing it if necessary
// getOrCreateSchedulerConfig is used to get the scheduler config. We create a default
// config if it doesn't already exist for bootstrapping an empty cluster
func (s *Server) getOrCreateSchedulerConfig() *structs.SchedulerConfiguration {
state := s.fsm.State()
_, config, err := state.SchedulerConfig()
5 changes: 3 additions & 2 deletions nomad/plan_apply.go
@@ -187,7 +187,7 @@ func (p *planner) applyPlan(plan *structs.Plan, result *structs.PlanResult, snap
alloc.ModifyTime = now
}

// Set create and modify time for preempted allocs if any
// Set modify time for preempted allocs if any
// Also gather jobids to create follow up evals
preemptedJobIDs := make(map[structs.NamespacedID]struct{})
for _, alloc := range req.NodePreemptions {
@@ -358,9 +358,10 @@ func evaluatePlanPlacements(pool *EvaluatePool, snap *state.StateSnapshot, plan
}

if nodePreemptions := plan.NodePreemptions[nodeID]; nodePreemptions != nil {
var filteredNodePreemptions []*structs.Allocation

// Do a pass over preempted allocs in the plan to check
// whether the alloc is already in a terminal state
var filteredNodePreemptions []*structs.Allocation
for _, preemptedAlloc := range nodePreemptions {
alloc, err := snap.AllocByID(nil, preemptedAlloc.ID)
if err != nil {
115 changes: 115 additions & 0 deletions nomad/plan_apply_test.go
@@ -12,6 +12,7 @@ import (
"github.com/hashicorp/nomad/testutil"
"github.com/hashicorp/raft"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)

const (
@@ -271,6 +272,120 @@ func TestPlanApply_EvalPlan_Simple(t *testing.T) {
}
}

func TestPlanApply_EvalPlan_Preemption(t *testing.T) {
t.Parallel()
state := testStateStore(t)
node := mock.Node()
node.NodeResources = &structs.NodeResources{
Cpu: structs.NodeCpuResources{
CpuShares: 2000,
},
Memory: structs.NodeMemoryResources{
MemoryMB: 4192,
},
Disk: structs.NodeDiskResources{
DiskMB: 30 * 1024,
},
Networks: []*structs.NetworkResource{
{
Device: "eth0",
CIDR: "192.168.0.100/32",
MBits: 1000,
},
},
}
state.UpsertNode(1000, node)

preemptedAlloc := mock.Alloc()
preemptedAlloc.NodeID = node.ID
preemptedAlloc.AllocatedResources = &structs.AllocatedResources{
Shared: structs.AllocatedSharedResources{
DiskMB: 25 * 1024,
},
Tasks: map[string]*structs.AllocatedTaskResources{
"web": {
Cpu: structs.AllocatedCpuResources{
CpuShares: 1500,
},
Memory: structs.AllocatedMemoryResources{
MemoryMB: 4000,
},
Networks: []*structs.NetworkResource{
{
Device: "eth0",
IP: "192.168.0.100",
ReservedPorts: []structs.Port{{Label: "admin", Value: 5000}},
MBits: 800,
DynamicPorts: []structs.Port{{Label: "http", Value: 9876}},
},
},
},
},
}

// Insert a preempted alloc such that the alloc will fit only after preemption
state.UpsertAllocs(1001, []*structs.Allocation{preemptedAlloc})

alloc := mock.Alloc()
alloc.AllocatedResources = &structs.AllocatedResources{
Shared: structs.AllocatedSharedResources{
DiskMB: 24 * 1024,
},
Tasks: map[string]*structs.AllocatedTaskResources{
"web": {
Cpu: structs.AllocatedCpuResources{
CpuShares: 1500,
},
Memory: structs.AllocatedMemoryResources{
MemoryMB: 3200,
},
Networks: []*structs.NetworkResource{
{
Device: "eth0",
IP: "192.168.0.100",
ReservedPorts: []structs.Port{{Label: "admin", Value: 5000}},
MBits: 800,
DynamicPorts: []structs.Port{{Label: "http", Value: 9876}},
},
},
},
},
}
plan := &structs.Plan{
Job: alloc.Job,
NodeAllocation: map[string][]*structs.Allocation{
node.ID: {alloc},
},
NodePreemptions: map[string][]*structs.Allocation{
node.ID: {preemptedAlloc},
},
Deployment: mock.Deployment(),
DeploymentUpdates: []*structs.DeploymentStatusUpdate{
{
DeploymentID: uuid.Generate(),
Status: "foo",
StatusDescription: "bar",
},
},
}
snap, _ := state.Snapshot()

pool := NewEvaluatePool(workerPoolSize, workerPoolBufferSize)
defer pool.Shutdown()

result, err := evaluatePlan(pool, snap, plan, testlog.HCLogger(t))

require := require.New(t)
require.NoError(err)
require.NotNil(result)

require.Equal(result.NodeAllocation, plan.NodeAllocation)
require.Equal(result.Deployment, plan.Deployment)
require.Equal(result.DeploymentUpdates, plan.DeploymentUpdates)
require.Equal(result.NodePreemptions, plan.NodePreemptions)

}

func TestPlanApply_EvalPlan_Partial(t *testing.T) {
t.Parallel()
state := testStateStore(t)
4 changes: 2 additions & 2 deletions nomad/state/state_store.go
@@ -3889,9 +3889,9 @@ func (s *StateStore) SchedulerSetConfig(idx uint64, config *structs.SchedulerCon
return nil
}

// SchedulerCASConfig is used to try updating the scheduler configuration with a
// SchedulerCASConfig is used to update the scheduler configuration with a
// given Raft index. If the CAS index specified is not equal to the last observed index
// for the config, then the call is a noop,
// for the config, then the call is a noop.
func (s *StateStore) SchedulerCASConfig(idx, cidx uint64, config *structs.SchedulerConfiguration) (bool, error) {
tx := s.db.Txn(true)
defer tx.Abort()
32 changes: 1 addition & 31 deletions nomad/structs/structs.go
@@ -7411,37 +7411,6 @@ func (a *Allocation) ComparableResources() *ComparableResources {
}
}

// COMPAT(0.11): Remove in 0.11
// CompatibleNetworkResources returns network resources on the allocation
// by reading AllocatedResources which are populated starting in 0.9 and
// falling back to pre 0.9 fields (Resources/TaskResources) if set
func (a *Allocation) CompatibleNetworkResources() []*NetworkResource {
var ret []*NetworkResource
// Alloc already has 0.9+ behavior
if a.AllocatedResources != nil {
var comparableResources *ComparableResources
for _, taskResource := range a.AllocatedResources.Tasks {
if comparableResources == nil {
comparableResources = taskResource.Comparable()
} else {
comparableResources.Add(taskResource.Comparable())
}
}
ret = comparableResources.Flattened.Networks
} else if a.Resources != nil {
// Alloc has pre 0.9 total resources
ret = a.Resources.Networks
} else if a.TaskResources != nil {
// Alloc has pre 0.9 task resources
resources := new(Resources)
for _, taskResource := range a.TaskResources {
resources.Add(taskResource)
}
ret = resources.Networks
}
return ret
}

// LookupTask by name from the Allocation. Returns nil if the Job is not set, the
// TaskGroup does not exist, or the task name cannot be found.
func (a *Allocation) LookupTask(name string) *Task {
@@ -8207,6 +8176,7 @@ func (p *Plan) AppendPreemptedAlloc(alloc *Allocation, desiredStatus, preempting
if alloc.AllocatedResources != nil {
newAlloc.AllocatedResources = alloc.AllocatedResources
} else {
// COMPAT Remove in version 0.11
newAlloc.TaskResources = alloc.TaskResources
newAlloc.SharedResources = alloc.SharedResources
}