-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Host Update API integration #35
Conversation
4e7fd4f
to
77aa8dc
Compare
pkg/agent/agent.go
Outdated
// Get the updater interface version from the node label | ||
var platformVersion = node.Labels[marker.UpdaterInterfaceVersionKey] | ||
switch platformVersion { | ||
case "2.0.0": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to use exactly 2.0.0, or a semver-compatible range more akin to 2.x.y?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why not, but I also don't really see why we should. The platform version here is to differentiate hosts that can support the update API or not. There's really nothing much in between.
I think ideally the label would be more explicit. Instead of a version, it would just be the supported update platform, "updog" or "update-api".
pkg/agent/agent.go
Outdated
default: | ||
log.Info("unknown platform version specified, defaulting to using updog") | ||
fallthrough | ||
case "1.0.0": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above, do we want exactly 1.0.0?
a8893ea
to
44d3078
Compare
Push above addresses @samuelkarp and @jahkeup 's comments:
|
Testing wise, I got a node to update via the operator using the update API. However, I'm running into an issue where the agent repeatedly tries to prepare the update even after each attempt succeeds. The agent spins like this for a while until it finally realizes the update is ready then finally See below: Retrieves the available update fine,
Then the agent tries to
But then it goes back to posting the same intent again? It tries to re-
Then finally the agent decides to
Not entirely sure why this happens. I suspect it has something to with me sleeping to wait for the Logs from another attempt:
|
I don't think it is the sleep causing issue. The main event loop shouldn't be
We've seen similiar behavior with similiar messages. This makes me think it
When this happens, the Kubernetes control plane is literally rejecting the I think Reading over the rest of the func (a *Agent) postIntent(in *intent.Intent) error {
err := a.poster.Post(in)
if err == nil {
// track the successfully updated intent.
a.tracker.recordPost(in)
}
return err
} It's worth trying again with that method updated to see if the looping subsides. Also, if you are able to get the controller's logs at the same time that you saw I'll try out updating a cluster myself, though probably not until later today or |
Fixes postIntent logic so it properly records the intent if no error happened
44d3078
to
ccf3431
Compare
Fixed the issue described in #35 (comment) thanks to @jahkeup 's suggestion. Now the node properly updates without spinning in place:
|
ccf3431
to
e56a9cd
Compare
Addresses @samuelkarp 's comments. |
Tested the latest changes and they still work. |
e56a9cd
to
edf81ed
Compare
Push above addresses @jahkeup 's comments. Merged the two config files for deploying the operator. The diff with the original |
edf81ed
to
e2cc062
Compare
Addresses @samuelkarp 's comments. Added a comment above the 10 second timeout when creating the http client (forgot to do this in the previous set of changes, oops!) Push below fixes a grammar error. |
e2cc062
to
0d65f72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍄
|
||
func TestUnmarshallUpdateStatus(t *testing.T) { | ||
update_string := "Starting update to 0.4.0\n" | ||
cases := []struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, nice work creating these test cases.
0d65f72
to
6c655c0
Compare
Push above addresses #35 (comment) |
6c655c0
to
69da3d8
Compare
Addresses @jahkeup's comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with everything here except for formatting. I can re-approve when you run gofmt
again.
Adds support for updating via the Bottlerocket Update API. The new platform is used if the node is labeled bottlerocket.aws/updater-interface-version=2.0.0
69da3d8
to
c2df97a
Compare
Push above fixes formatting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done! 👏 👍
I'm excited to see this roll out with the update API 😃
Issue number:
N/A
Description of changes:
Adds support for updating via the Bottlerocket Update API.
The new platform is used if the node is labeled
bottlerocket.aws/updater-interface-version=2.0.0
Testing done:
Agent successfully scheduled a pod onto a node and finished an update:
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.