Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: more robust config file modifications in install-deadline-worker #512

Open
wants to merge 1 commit into
base: mainline
Choose a base branch
from

Conversation

jusiskin
Copy link
Contributor

@jusiskin jusiskin commented Dec 17, 2024

What was the problem/requirement? (What/Why)

Background

The worker agent provides a install-deadline-worker CLI command that is used for setting up a worker host. The command accepts configuration settings in its command-line arguments for configuration settings to be applied. Those settings are applied by modifying the worker agent configuration file (docs), which uses the TOML file format.

Problem

The code for modifying the worker agent configuration file used simple pattern matching and replacement. For Linux, the sed command was used. On Windows, Python's re.sub(...) was used.

This implementation is fragile and comes with multiple issues:

  1. Any trailing inline TOML comments, for example:

    [worker]
    farm_id = "farm-123"  # our staging farm

    could cause the pattern matching to fail

  2. If the pattern matching allows for trailing comments, the replacement may not preserve them after modification.

  3. The patterns may be overly-strict, for example:

    -e "s,^# farm_id\s*=\s*\"REPLACE-WITH-WORKER-FARM-ID\"$,farm_id = \"${farm_id}\",g" \
    -e "s,^# fleet_id\s*=\s*\"REPLACE-WITH-WORKER-FLEET-ID\"$,fleet_id = \"${fleet_id}\",g" \

    Which is the root cause of Running install-deadline-worker with a previous farm and fleet configured fails #308

  4. If the config file exists but some settings are not defined or commented out, the configuration cannot be applied. One example is when a new setting is introduced to the config file (feat: configurable session root directory #513). In such cases, the current implementation of install-deadline-worker will not be able to find an existing setting and fail.

What was the solution? (How)

Added tomlkit as a dependency. This is a Python package that allows for modifying TOML documents while preserving style (comments, blank space, ordering, indentation, etc...).

Functionality was added to deadline_worker_agent.config.config_file.ConfigFile to modify settings for a tomlkit document (or alternatively specifying a path to a config file and tomlkit is used to modify it).

The logic for modifying settings has the following algorithm:

flowchart LR

A((Start)) -->|Set| B{Exists?}
A -->|Unset| C{Exists?}
B -->|Exists| D[Modify]
D --> H((Done))
B -->|Commented out| E[Uncomment and set value]
E --> H
B -->|Does not exist| F[Create with doc comments]
F --> H
C -->|Exists| G[Comment out]
G --> H
C -->|Does not exist| H
Loading

This code was integrated into the install-deadline-worker logic. Of particular note, the Linux implementation of install-deadline-worker runs a bash script (install.sh) in a subprocess which handles the installation logic. To integrate with the new Python logic for TOML config file modification, a command-line interface was added to src/deadline_worker_agent/config/__main__.py.

The path to the Python interpreter is passed as an argument to install.sh which is used to invoke this Python command-line interface:

${python_interpreter_path} -m deadline_worker_agent.config ...

What is the impact of this change?

  1. The install-deadline-worker command uses more robust code for modifying the worker agent configuration file and preserving style and comments of the TOML document.
  2. Can use the install-deadline-worker to change the farm and fleet ID of an existing worker agent config file (fixes Running install-deadline-worker with a previous farm and fleet configured fails #308)

How was this change tested?

  1. Integration tests were added for the newly added Python functions and the deadline_worker_agent.config command-line interface. I've expanded the GitHub workflows to also run integration tests on Ubuntu (they were previously just run on Windows)
  2. I ran the worker agent end-to-end tests which succeeded.

Was this change documented?

No, this is mostly fixing our implementation and has more robust behavior for edge cases.

Is this a breaking change?

No


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@jusiskin jusiskin added the bug Something isn't working label Dec 17, 2024
@jusiskin jusiskin requested a review from a team as a code owner December 17, 2024 19:52
@jusiskin jusiskin force-pushed the config_file_modification branch 4 times, most recently from 0f8c049 to fa9fa2b Compare December 17, 2024 20:21
@jusiskin jusiskin force-pushed the config_file_modification branch from fa9fa2b to cac7338 Compare December 18, 2024 22:04
@jusiskin jusiskin changed the title fix: TOML-aware configuration changes fix: more robust configuration file modifications in install-deadline-worker Dec 18, 2024
@jusiskin jusiskin changed the title fix: more robust configuration file modifications in install-deadline-worker fix: more robust config file modifications in install-deadline-worker Dec 18, 2024
@jusiskin jusiskin force-pushed the config_file_modification branch from cac7338 to 4c01170 Compare December 19, 2024 17:34
Signed-off-by: Josh Usiskin <56369778+jusiskin@users.noreply.github.com>
@jusiskin jusiskin force-pushed the config_file_modification branch from 4c01170 to e14bf49 Compare December 19, 2024 17:40
@jusiskin
Copy link
Contributor Author

There are "expected checks" (docs) that are still waiting here.

This is because this PR changes the name of the checks (see .github/workflows/code_quality.yml). Once I have review approvals and we are ready to merge this, we will modify the branch protection rules for mainline to reflect the new check names.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running install-deadline-worker with a previous farm and fleet configured fails
1 participant