Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination #1633

Merged
merged 8 commits into from
Oct 29, 2019

Conversation

trstringer
Copy link
Contributor

@trstringer trstringer commented Sep 13, 2019

Description

Currently waagent relies on explicit configuration (Provisioning.UseCloudInit to be set, and Provisioning.Enabled to be falsy). This can pose an interesting problem with the introduction of cloud-init. When waagent and cloud-init live side-by-side, they will contend with each other to be the provisioning agent. Image maintainers need to explicitly set configuration (above) so that waagent allows cloud-init to be the provisioning agent. With maintainers possibly not knowing about this behavior, they could unknowing cause provisioning issues.

This PR changes that behavior by deprecating Provisioning.UseCloudInit and Provisioning.Enabled configuration options and introduces a new configuration setting: Provisioning.Agent. The following values change the behavior of which provisioning agent to use:

  • auto (default) - Allow waagent to autodetect to see if cloud-init is enabled. If so, use cloud-init as the provisioning agent. If not, use waagent as the provisioning agent.
  • waagent - Force waagent as the provisioning agent.
  • cloud-init - Force cloud-init as the provisioning agent.
  • disabled - Disable all provisioning.

PR information

  • The title of the PR is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For information on cleaning up the commits in your pull request, see this page.
  • Except for special cases involving multiple contributors, the PR is started from a fork of the main repository, not a branch.
  • If applicable, the PR references the bug/issue that it fixes in the description.
  • New Unit tests were added for the changes made and Travis.CI is passing.

Quality of Code and Contribution Guidelines


This change is Reviewable

README.md Outdated Show resolved Hide resolved
@trstringer trstringer changed the title Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Sep 13, 2019
@codecov
Copy link

codecov bot commented Sep 19, 2019

Codecov Report

Merging #1633 into develop will increase coverage by 0.19%.
The diff coverage is 68.75%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1633      +/-   ##
===========================================
+ Coverage    67.34%   67.54%   +0.19%     
===========================================
  Files           80       80              
  Lines        11433    11456      +23     
  Branches      1605     1604       -1     
===========================================
+ Hits          7700     7738      +38     
+ Misses        3393     3374      -19     
- Partials       340      344       +4
Impacted Files Coverage Δ
azurelinuxagent/pa/provision/factory.py 100% <100%> (ø) ⬆️
azurelinuxagent/common/conf.py 79.41% <100%> (ø) ⬆️
azurelinuxagent/pa/provision/cloudinit.py 58.06% <52.38%> (+31.74%) ⬆️
azurelinuxagent/pa/provision/default.py 73.97% <0%> (-1.03%) ⬇️
azurelinuxagent/common/cgroupapi.py 83.01% <0%> (+0.06%) ⬆️
azurelinuxagent/daemon/main.py 73.19% <0%> (+0.27%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e013ce...4e81894. Read the comment docs.

@trstringer trstringer changed the title (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Sep 19, 2019
@trstringer trstringer changed the title Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 2, 2019
@trstringer trstringer changed the title (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 8, 2019
@trstringer
Copy link
Contributor Author

Ready for review. // @jasonzio

README.md Outdated Show resolved Hide resolved
azurelinuxagent/common/conf.py Outdated Show resolved Hide resolved
Comment on lines +142 to +146
systemctl_output = subprocess.check_output([
'systemctl',
'is-enabled',
'cloud-init-local.service'
], stderr=subprocess.STDOUT).decode('utf-8').replace('\n', '')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will produce an exception if the service is missing.

I'm also not sure using systemctl is a particular good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An exception is ok. This is wrapped in a try/except so that any exception case (such as a missing service) will just result in a falsy return. As for using systemctl, it is the best option forward from what I've seen. I originally tried using dbus but there is no guarantee that the system dbus is up and running at this point, whereas systemd has it's own internal dbus for communication (which systemctl uses) that we can rely on at this boot stage. Let me know if you have any other hesitations for using systemctl otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not certain that calling systemctl from within a code path that runs as part of a systemd unit will always produce reliable results. I should but I don't know for certain nor have I found any documentation thereof.

This also makes the assumption that the generator (which is safe) has run, and properly detected Azure (there have been generator detection issues). Last but not least it assumes that all images run the generator that is part of cloud-init. There are other way to run cloud-init-local.service. While we can argue that that's not the "standard way" when/if someone has a "brilliant idea" and this code breaks I did want to point out at the implicit dependency that is being created by this check.

I am not advocating for a change, just making a comment.

@trstringer trstringer changed the title Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 10, 2019
@anhvoms
Copy link
Collaborator

anhvoms commented Oct 10, 2019

LGTM

Copy link
Collaborator

@anhvoms anhvoms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 12 files at r2, 2 of 2 files at r3.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @larohra, @Moustafa-Moustafa, @narrieta, @pgombar, @trstringer, @vrdmr, and @waldiTM)

@trstringer
Copy link
Contributor Author

Thanks for the reviews, @jasonzio and @anhvoms. @narrieta, please hold off on merging this in just yet, I want to do a few more manual tests before we merge.

@trstringer
Copy link
Contributor Author

Validated with canonical:ubuntuserver:18.04-lts:latest and credativ:debian:9:latest.

@narrieta, let me know if there are any other things needed for this PR to get merged. Thanks!

@trstringer trstringer changed the title (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 16, 2019
Copy link
Member

@narrieta narrieta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments, but looks good to me.

I'll do an automation run and approve/merge afterwards.

specifying "auto"). Possible options are "auto" (default), "waagent", "cloud-init",
or "disabled".

#### __Provisioning.Enabled__ (*removed in VERSION*)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't removing this be a breaking change? who are the likely users of those settings? thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These settings need to be made in the image from which a VM is created, and they're generally set by a user only when they create a new image. Simply updating waagent in an already-provisioned VM should have no impact, as the settings are ignored (since provisioning already happened).

I can imagine someone being impacted if they take an image they'd already created, update the waagent, and use that to create a new image. It might be worthwhile for the waagent package (deb or rpm) to have a post-install script check the config file for the presence of the Provisioning.Enabled config setting and emit a warning or error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also be a good idea for the provisioning code in waagent to detect the presence of Provisioning.Enabled and emit a warning in the log (and via telemetry). This should make diagnosis easier in the few cases where the change gets missed.

azurelinuxagent/common/conf.py Show resolved Hide resolved
@trstringer
Copy link
Contributor Author

@narrieta hold off on merging just yet, we're going to try to get another review of this before we move forward. Changing this to "work in progress" for the time being.

@trstringer trstringer changed the title Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 16, 2019
@narrieta
Copy link
Member

@trstringer OK

Comment on lines +142 to +146
systemctl_output = subprocess.check_output([
'systemctl',
'is-enabled',
'cloud-init-local.service'
], stderr=subprocess.STDOUT).decode('utf-8').replace('\n', '')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not certain that calling systemctl from within a code path that runs as part of a systemd unit will always produce reliable results. I should but I don't know for certain nor have I found any documentation thereof.

This also makes the assumption that the generator (which is safe) has run, and properly detected Azure (there have been generator detection issues). Last but not least it assumes that all images run the generator that is part of cloud-init. There are other way to run cloud-init-local.service. While we can argue that that's not the "standard way" when/if someone has a "brilliant idea" and this code breaks I did want to point out at the implicit dependency that is being created by this check.

I am not advocating for a change, just making a comment.

azurelinuxagent/pa/provision/factory.py Show resolved Hide resolved
@trstringer trstringer changed the title (WIP) Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Add cloud-init auto-detect to prevent multiple provisioning mechanisms from relying on configuration for coordination Oct 24, 2019
@trstringer
Copy link
Contributor Author

@narrieta if this PR looks good it should be ready for merge.

@narrieta
Copy link
Member

running automation

@trstringer
Copy link
Contributor Author

@narrieta ready for re-running and consideration. Thanks!

@narrieta
Copy link
Member

@trstringer - Running automation

@narrieta
Copy link
Member

Automation OK

@narrieta narrieta merged commit 6211315 into Azure:develop Oct 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants