update-agent: rework steady/polling state #233

lucab · 2020-02-27T11:53:20Z

This splits the "reported steady" and "checked, but no updates
available" states, making it easier to track and report upgrade
progress.

Closes: #229

lucab · 2020-02-28T09:36:25Z

/cc @jlebon for review (#229 has the whole rationale)

jlebon · 2020-02-28T15:23:30Z

src/update_agent/actor.rs

@@ -135,7 +145,7 @@ impl UpdateAgent {
    }

    /// Initialize the update agent.
-    fn initialize(&mut self) -> ResponseActFuture<Self, Result<(), ()>> {
+    fn tick_initialize(&mut self) -> ResponseActFuture<Self, Result<(), ()>> {


Why the naming changes? Is it to represent that these functions are directly called by the tick handler?

Yes (but mostly: a drive-by change to make code easier to skim/grep).

jlebon · 2020-02-28T15:27:55Z

src/update_agent/mod.rs

+    /// Node steady, agent allowed to check for updates.
+    ReportedSteady,
+    /// No further updates available yet.
+    NoNewUpdate,


Bikeshed: WDYT about naming the NoNewUpdate one with the steady keyword, since it's likely going to be the stable/steady state in which the agent will mostly sit? And maybe instead of ReportedSteady, something like ReadyToCheck? That makes it a little more obvious too that it hasn't checked for updates yet.

Unfortunately I'm a dummy so I have already coupled the meaning of "steady" with

zincati/src/fleet_lock/mod.rs, line 25 in 6bad305:

static V1_STEADY_STATE: &str = "v1/steady-state";

So, to avoid confusion I'd rather stick with the definition of "steady" as "we released the lock, marking the current release green".

Hmm, is coreos/airlock#1 still open for change? Would it make sense to change the specs here so that we only actually release the lock once we've also been able to check for an update? It intuitively makes sense to me to do that, but I might be missing something.

> Hmm, is coreos/airlock#1 still open for change?

Not really anymore.

> Would it make sense to change the specs here so that we only actually release the lock once we've also been able to check for an update?

I don't think so. We don't want to couple the "did the node reboot" question with the "is Cincinnati-service ok" question inside Zincati. Plus, users can enforce stronger pre-conditions via https://github.com/coreos/zincati/blob/master/dist/systemd/system/zincati.service#L9-L10

Ahh OK right. I'd like to chat about this some more, but I don't think it blocks this PR as is.

Forwarded to #239.
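For readers skimming this thread, the split under discussion can be pictured with a minimal sketch. Only `ReportedSteady` and `NoNewUpdate` (and their doc comments) come from this PR; the `AgentState` name, the `UpdateAvailable` variant, and the `after_check` and `label` helpers are hypothetical and are not Zincati's actual code.

```rust
/// Hypothetical, trimmed-down agent state. Only `ReportedSteady` and
/// `NoNewUpdate` are names taken from this PR.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum AgentState {
    /// Node steady, agent allowed to check for updates.
    ReportedSteady,
    /// No further updates available yet.
    NoNewUpdate,
    /// Hypothetical: an update was found; carries its version string.
    UpdateAvailable(String),
}

impl AgentState {
    /// Fold the outcome of one update check into the state (hypothetical helper).
    pub fn after_check(self, found: Option<String>) -> AgentState {
        match (self, found) {
            // Checked, nothing new: this is now distinct from "reported steady".
            (AgentState::ReportedSteady, None) | (AgentState::NoNewUpdate, None) => {
                AgentState::NoNewUpdate
            }
            // Checked, something new is available.
            (AgentState::ReportedSteady, Some(v)) | (AgentState::NoNewUpdate, Some(v)) => {
                AgentState::UpdateAvailable(v)
            }
            // Any other state is left untouched by a check result.
            (other, _) => other,
        }
    }

    /// Label suitable for a progress report or a metric (hypothetical helper).
    pub fn label(&self) -> &'static str {
        match self {
            AgentState::ReportedSteady => "reported_steady",
            AgentState::NoNewUpdate => "no_new_update",
            AgentState::UpdateAvailable(_) => "update_available",
        }
    }
}
```

With two separate variants, a report can tell "lock released, no check performed yet" apart from "checked, nothing to do", which is the progress-tracking benefit the PR description mentions.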

jlebon · 2020-02-28T15:31:51Z

src/update_agent/mod.rs

+    #[allow(dead_code)]
    fn discriminant(&self) -> u8 {


Do we actually need this function? The only place that was using it is now gone.

Yes, dropped. I was planning to use it for checking the "from" state on transition, but I realized I cannot do it that way.

Self-note: one approach to explore here is to statically enforce allowed transitions via generics, like https://hoverbear.org/blog/rust-state-machine-pattern/#generically-sophistication.
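A rough sketch of what that could look like, in the spirit of the linked post rather than a copy of it. The marker types reuse state names from this PR, while `Agent<S>`, its `checks` field, and the transition methods are hypothetical. Transitions consume `self`, so a transition that is not defined for a given state fails to compile instead of being checked at runtime.

```rust
use std::marker::PhantomData;

// Zero-sized marker types, one per state. Names mirror the PR's states;
// `UpdateAvailable` is assumed here for illustration.
pub struct ReportedSteady;
pub struct NoNewUpdate;
pub struct UpdateAvailable;

/// Hypothetical agent whose current state is tracked in the type parameter.
pub struct Agent<S> {
    checks: u32,
    _state: PhantomData<S>,
}

impl<S> Agent<S> {
    /// Number of update checks performed so far; available in every state.
    pub fn checks(&self) -> u32 {
        self.checks
    }

    /// Internal helper: move to state `T`, counting one more check.
    fn checked<T>(self) -> Agent<T> {
        Agent { checks: self.checks + 1, _state: PhantomData }
    }
}

impl Agent<ReportedSteady> {
    /// Start in the "reported steady" state.
    pub fn new() -> Self {
        Agent { checks: 0, _state: PhantomData }
    }

    pub fn no_update_found(self) -> Agent<NoNewUpdate> {
        self.checked()
    }

    pub fn update_found(self) -> Agent<UpdateAvailable> {
        self.checked()
    }
}

impl Agent<NoNewUpdate> {
    /// Polling again with no result stays in the same state.
    pub fn no_update_found(self) -> Agent<NoNewUpdate> {
        self.checked()
    }

    pub fn update_found(self) -> Agent<UpdateAvailable> {
        self.checked()
    }
}
```

Because `Agent<UpdateAvailable>` defines no `no_update_found` method in this sketch, going from "update available" back to "no new update" is rejected by the compiler, which is the kind of "from"-state check that a runtime `discriminant()` comparison would otherwise have to do.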

jlebon · 2020-02-28T15:33:05Z

Rationale overall makes sense to me otherwise!


lucab · 2020-02-28T16:52:03Z

Rebased.

jlebon

LGTM!

lucab added kind/groundwork area/updates labels Feb 27, 2020

lucab requested a review from jlebon February 27, 2020 11:53

lucab added this to the vNext milestone Feb 27, 2020

jlebon reviewed Feb 28, 2020


lucab added 2 commits February 28, 2020 16:48

update-agent: unify metrics logic

578f553

This moves metrics logic for the agent, unifying it into a single place.

update-agent: rework steady/polling state

878a1a1

This splits the "reported steady" and "checked, but no updates available" states, making it easier to track and report upgrade progress.

lucab force-pushed the ups/split-steady-state branch from b91be90 to 878a1a1 Compare February 28, 2020 16:50

jlebon approved these changes Feb 28, 2020


lucab mentioned this pull request Mar 2, 2020

design: clarify when it is correct to report steady #239 (Open)

lucab merged commit b1c0936 into coreos:master Mar 2, 2020

lucab deleted the ups/split-steady-state branch June 8, 2020 13:34
