Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move chroot building into TaskRunner #2132

Merged
merged 13 commits into from
Jan 6, 2017
Merged

Move chroot building into TaskRunner #2132

merged 13 commits into from
Jan 6, 2017

Conversation

schmichael
Copy link
Member

@schmichael schmichael commented Dec 21, 2016

The largest part of this change is refactoring a allocdir.TaskDir struct out of the allocdir.AllocDir struct. AllocRunner now builds only the alloc dir and defers task dir building to TaskRunner.

This allows TaskRunner to consult the new Driver.FSIsolation() method to inform TaskDir.Build() what kind of filesystem layout the task driver expects.

The executor no longer has access to the AllocDir (or even the new TaskDir) struct. Instead it just knows whether fs isolation is used (chroot) and what its task and log paths are.

Feel free to bikeshed names and API details. I think the basic approach is sound, but I'm sure lots can be tweaked.

Test Changes

Most test changes are mechanical to fix API differences.

I did switch a couple test helper functions to return test-only structs instead of having a number of return values: testDriverContext

It made for an obnoxious number of small changes, but it does make it easier to tweak what test helpers setup in the future.

Future Work

  • GetTaskEnv could be refactored to not take both Node and Config
  • Add a Driver.Cleanup method for cases like where Docker.Prestart downloads an image but is then killed. Cleanup would remove the leaked image.

if r.ctx == nil {
snapshotErrors.Errors = append(snapshotErrors.Errors, fmt.Errorf("alloc_runner snapshot includes a nil context"))
if r.allocDir == nil {
//FIXME Upgrade path?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could probably keep the context for now, and if the r.allocDir is still nil, create it. That should take care of the upgrade scenario. And remove context in the next release.

err := fmt.Errorf("failed to find task dir metadata for alloc %q task %q",
r.alloc.ID, name)
r.logger.Printf("[ERR] client: %v", err)
mErr.Errors = append(mErr.Errors, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should error out and not do anything else if we can't find the task directory from the alloc dir struct. You will be passing nil otherwise to the TaskRunner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Fixed.

@@ -448,7 +454,11 @@ func (r *AllocRunner) Run() {
continue
}

tr := NewTaskRunner(r.logger, r.config, r.setTaskState, r.ctx, r.Alloc(), task.Copy(), r.vaultClient)
r.allocDirLock.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be cleaner if AllocDir maintains it's own locks internally so that the consumers of the API doesn't have to ensure they are locked appropriately.

Copy link
Member Author

@schmichael schmichael Jan 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly we'd still need this lock because it guards the setting and checking the allocDir field on the AllocRunner.

We might be able to skip acquiring it on this particular line if AllocDir had an internal lock, but I'd rather always guard accesses to a field than try to figure out safe places to skip synchronization and confuse future readers of the code.

Copy link
Contributor

@dadgar dadgar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

cstructs "github.com/hashicorp/nomad/client/structs"
)

type TaskDir struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on the struct

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// create paths on disk.
//
// Call AllocDir.NewTaskDir to create new TaskDirs
func newTaskDir(allocDir, taskName string) *TaskDir {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make these take a logger. Will be helpful in the future

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that during development and removed it... that probably should have been a hint to keep it!

Fixed.

return nil, fmt.Errorf("Could not find task directory for task: %v", d.DriverContext.taskName)
}
pluginLogFile := filepath.Join(taskDir, fmt.Sprintf("%s-executor.out", task.Name))
pluginLogFile := filepath.Join(ctx.TaskDir.Dir, fmt.Sprintf("%s-executor.out", task.Name))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you drop the task name as part of the file. It is within the directory anyways

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gladly. Fixed.

@@ -277,30 +272,41 @@ func (e *UniversalExecutor) LaunchCmd(command *ExecCommand) (*ProcessState, erro
return nil, err
}

e.logger.Printf("[DEBUG] executor: XXX 1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😨

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my. Fixed all of these.


// Get the command to be ran
command := driverConfig.Command
if err := validateCommand(command, "args"); err != nil {
d.logger.Printf("[WARN] driver.raw_exec: XXX error validating command")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May wanna scan through the whole PR and clean some of these up

* Refactor AllocDir to have a TaskDir struct per task.
* Drivers expose filesystem isolation preference
* Fix lxc mounting of `secrets/`
AllocRunner's state dropped the Context struct which needs to be
converted to the new AllocDir+TaskDir structs in RestoreState.

TaskRunner added a TaskDirBuilt flag, but it's safe to just let that
default to `false` and rebuild all task dirs once on upgrade.
AllocClientStatus string
AllocClientDescription string
Context *driver.ExecContext

// Context is deprecated and only used to migrate from older releases.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the following format for these: // COMPAT: Remove in 0.X.0: <reason for compatibility code and when it can be removed

Consolidate task environment building in GetTaskEnv since it can
determine what kind of filesystem isolation is used.

This means drivers no longer have to manipulate task environment paths.
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants