fix(apt): Don't warn on 822 source format #5028

holmanb · 2024-03-07T23:51:32Z

Proposed Commit Message

fix(apt): Don't warn on 822 source format

spurious warnings, probably shouldn't do that to users

holmanb · 2024-03-07T23:51:55Z

fyi @cjp256

blackboxsw · 2024-03-08T04:49:35Z

cloudinit/config/cc_apt_configure.py

@@ -708,15 +708,15 @@ def generate_sources_list(cfg, release, mirrors, cloud):
        )
        if expected_content:
            if expected_content != util.load_text_file(apt_sources_list):
-                LOG.warning(


I think there are a couple integration tests that expect exit 2 due to deb822 warnings we know are in logs. You may need to fix those up as part of this PR.

Any clue which? I checked all of the tests that reference NOBLE and didn't see anything that looked like it would exit 2 due to this.

~~tests/integration_tests/modules/test_cli.py::test_invalid_userdata~~. Nevermind, I see we already fixed with this commit. 4e87db2. I don't see anything else that is checking for result.return_code == 2 from cloud-init status that isn't also intentionally setting up a failure path.

cjp256 · 2024-03-08T15:26:40Z

Related note. I'm wondering if the status code really just pushed warnings into the error category. We kind of lost that middle ground.

I know we make use of warnings in the Azure data source to indicate soft (retriable) failures that may be delaying boot. Similar to this, I now have to debate whether or not downgrade warnings to info, but then users will no longer get the heads up on the console that DHCP/IMDS/etc. failed and we're retrying so sorry about the delay....

I also understand that a lot of warnings really should be errors.

Thoughts? Should we have an API to push a log out to console out that's not at warning+ level?

holmanb · 2024-03-08T18:03:26Z

Thanks for the thoughts @cjp256.

Related note. I'm wondering if the status code really just pushed warnings into the error category. We kind of lost that middle ground.

I know we make use of warnings in the Azure data source to indicate soft (retriable) failures that may be delaying boot. Similar to this, I now have to debate whether or not downgrade warnings to info, but then users will no longer get the heads up on the console that DHCP/IMDS/etc. failed and we're retrying so sorry about the delay....
I also understand that a lot of warnings really should be errors.

Yeah, that's not ideal. And I agree that if cloud-init was capable of meeting users expectations, even something less ideal happened, that we probably shouldn't consider that an error / nonzero return code - return code 2 on logged warning may be too agressive.

What if we:

switch error code 2 to only occur when >= logging.ERROR messages are logged
allow logging.WARN messages to include messages about situations were cloud-init is not behaving ideally, but is able to complete the expected configuration
require logs that indicate that cloud-init wasn't able to complete a requested action to be logging.ERROR messages - (possibly bump the level on some of our current warnings)

This would align better with Python's the documented log levels, and would allow console messages without considering something recoverable error.

Should we have an API to push a log out to console out that's not at warning+ level?

That could work too.

blackboxsw

I'm good with us backing this behavior out to info level. In most cases in image production, we have addressed build artifacts leaving stale /etc/apt/sources.list files on the system. And this condition isn't really a warning more than an informational message that cloud-init corrected an invalid APT config file on the system that it used to silently blow away on first boot anyway. I double checked potential integration test impacts, but I think we are good now as 4e87db2 already expects happy state anyway, and no warnings from cloud-init status.

blackboxsw · 2024-03-09T20:50:09Z

Thanks for the thoughts @cjp256.

Related note. I'm wondering if the status code really just pushed warnings into the error category. We kind of lost that middle ground.
I know we make use of warnings in the Azure data source to indicate soft (retriable) failures that may be delaying boot. Similar to this, I now have to debate whether or not downgrade warnings to info, but then users will no longer get the heads up on the console that DHCP/IMDS/etc. failed and we're retrying so sorry about the delay....
I also understand that a lot of warnings really should be errors.

Yeah, that's not ideal. And I agree that if cloud-init was capable of meeting users expectations, even something less ideal happened, that we probably shouldn't consider that an error / nonzero return code - return code 2 on logged warning may be too agressive.

What if we:

switch error code 2 to only occur when >= logging.ERROR messages are logged

I really would prefer that we keep a higher visibility of warnings in our CLI exit codes if we can, though I recognize the cost dealing with these warnings is potentially painful if we are trying to remedy behavior changes quickly to avoid impacting some CI, scripts or automation which doesn't differentiate between exit 1 and exit 2+.

If we diminish the visibility of warnings again to just things dropped into logs I think we continue to avoid finding or tracking bugs or issues related to unexpected behaviors that otherwise go completely unnoticed (for years) as the warnings are buried in cloud-init's verbose logs.

I think we should consider each warning case and determine if it really needs to be a warning anymore as different issues arise on different platforms. We could perform a preliminary review of our warnings(~330) throughout code to evaluate whether we need to demote some of those to info level. We also have only another 100 info level log messages which we also try to use sparingly. As to whether Azure wants to surface info level logs to a console, I think we are looking at 11 info level logs per boot. Whether this is a reasonable volume or not depends on the dashboard/view into the console I suppose.

allow logging.WARN messages to include messages about situations were cloud-init is not behaving ideally, but is able to complete the expected configuration

require logs that indicate that cloud-init wasn't able to complete a requested action to be logging.ERROR messages - (possibly bump the level on some of our current warnings)

I think this is a good practice too. We should re-review/groom our warnings and decide what should ultimately be promoted to ERROR level as well.

fix(apt): Don't warn on 822 source format

98914a5

blackboxsw reviewed Mar 8, 2024

View reviewed changes

holmanb added the 24.1 label Mar 9, 2024

holmanb mentioned this pull request Mar 9, 2024

Updates for 24.1.1 #5036

Merged

1 task

blackboxsw approved these changes Mar 9, 2024

View reviewed changes

holmanb merged commit 52c6abd into canonical:main Mar 11, 2024
29 checks passed

holmanb added a commit to holmanb/cloud-init that referenced this pull request Mar 11, 2024

fix(apt): Don't warn on apt 822 source format (canonical#5028)

0a4b0c6

holmanb added a commit that referenced this pull request Mar 11, 2024

fix(apt): Don't warn on apt 822 source format (#5028)

f32228e

holmanb added a commit to holmanb/cloud-init that referenced this pull request Mar 12, 2024

fix(apt): Don't warn on apt 822 source format (canonical#5028)

91fd253

holmanb added a commit that referenced this pull request Mar 12, 2024

fix(apt): Don't warn on apt 822 source format (#5028)

e90a352

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(apt): Don't warn on 822 source format #5028

fix(apt): Don't warn on 822 source format #5028

holmanb commented Mar 7, 2024 •

edited

Loading

holmanb commented Mar 7, 2024

blackboxsw Mar 8, 2024

holmanb Mar 9, 2024

blackboxsw Mar 9, 2024 •

edited

Loading

cjp256 commented Mar 8, 2024 •

edited

Loading

holmanb commented Mar 8, 2024

blackboxsw left a comment

blackboxsw commented Mar 9, 2024

fix(apt): Don't warn on 822 source format #5028

fix(apt): Don't warn on 822 source format #5028

Conversation

holmanb commented Mar 7, 2024 • edited Loading

Proposed Commit Message

holmanb commented Mar 7, 2024

blackboxsw Mar 8, 2024

Choose a reason for hiding this comment

holmanb Mar 9, 2024

Choose a reason for hiding this comment

blackboxsw Mar 9, 2024 • edited Loading

Choose a reason for hiding this comment

cjp256 commented Mar 8, 2024 • edited Loading

holmanb commented Mar 8, 2024

blackboxsw left a comment

Choose a reason for hiding this comment

blackboxsw commented Mar 9, 2024

holmanb commented Mar 7, 2024 •

edited

Loading

blackboxsw Mar 9, 2024 •

edited

Loading

cjp256 commented Mar 8, 2024 •

edited

Loading