adding the node object to run_converge messages, resource schema cleanup #208

adamleff · 2016-06-28T16:59:39Z

Including the node object in run_converge messages will allow
consumers of Data Collector messages to filter converge completion
messages based on node data more easily than performing additional
queries or joins. This also reduces the number of Data Collector
messages sent by the Chef Client at the end of the run from 2 to 1.

Additionally, the resource schema included in the run_converge
message didn't match up to what was actually used by Chef Analytics
(and expected by other Chef projects) so it was cleaned up to match
appropriately.
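
To make the filtering benefit concrete, here is a minimal consumer-side sketch. It assumes messages arrive as already-parsed dictionaries; `message_type` and `node` come from the schema in this PR, while `chef_environment`, `automatic.platform`, and the helper name `filter_converges` are illustrative assumptions about the node object's layout, not part of the RFC.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

Message = Dict[str, Any]


def filter_converges(
    messages: Iterable[Message],
    *,
    environment: Optional[str] = None,
    platform: Optional[str] = None,
) -> Iterator[Message]:
    """Yield run_converge messages whose embedded node matches the filters.

    Hypothetical helper: the node keys used here ("chef_environment",
    "automatic" -> "platform") are assumptions about the node object.
    """
    for msg in messages:
        if msg.get("message_type") != "run_converge":
            continue
        # The node travels with the converge message, so no second lookup is needed.
        node = msg.get("node") or {}
        if environment is not None and node.get("chef_environment") != environment:
            continue
        node_platform = (node.get("automatic") or {}).get("platform")
        if platform is not None and node_platform != platform:
            continue
        yield msg
```

Because the node is embedded, a new filter (role, tag, and so on) only needs another comparison against `node`; the message schema itself does not have to change.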

btm · 2016-06-30T15:38:50Z

rfc077-mode-agnostic-data-collection.md

@@ -525,6 +445,7 @@ The Run End Schema will be used by Chef Client to notify the data collection ser
        "expanded_run_list",
        "message_type",
        "message_version",
+        "node",
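
For receivers, the practical effect of this hunk is that a Run End message without a `node` key no longer satisfies the schema. A rough sketch of that check follows; it covers only the four required keys visible in the hunk above (the full required list is longer and not shown here), and `missing_required_keys` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Only the required keys visible in the diff hunk; the real schema lists more.
REQUIRED_KEYS_FROM_HUNK = (
    "expanded_run_list",
    "message_type",
    "message_version",
    "node",
)


def missing_required_keys(message: Dict[str, Any]) -> List[str]:
    """Return the hunk-visible required keys that are absent from the message."""
    return [key for key in REQUIRED_KEYS_FROM_HUNK if key not in message]
```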


I'm still thinking about this, but do we really want to require a full node object in this message?

I (and others) think we do. A receiver of these messages is likely to want to filter by a variety of properties to get the run_converges they want: environment, role, platform, etc. Since we can't know all the different properties a user would want, and would otherwise have to change the schema each time a new use case is discovered, putting the node object on this message makes a lot of sense (IMO).

This doesn't actually increase the amount of data the Data Collector is sending as, prior to this change, we sent two messages at the end of the run - the run_converge message, and the action message with the node object. We're proposing consolidating them so users don't have to marry the two messages on their own (via JOINs or whatever), and this reduces the end-of-run messages to a single message.

The expectation is a converge is changing the state of a node (or at least asserting the state is what it currently is), and sending the node along with the converge-is-completed message is like handing over proof of that converge. I'm actually bummed we didn't do this initially :)
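
As a rough illustration of the "marry the two messages" step that consolidation removes, the sketch below joins converge messages to separately delivered node messages. The key names `node_name`, `entity_name`, and `data`, and the helper name, are assumptions made for the example, not a statement of the exact schemas.

```python
from typing import Any, Dict, Iterable, List

Message = Dict[str, Any]


def join_converges_with_nodes(
    converges: Iterable[Message],
    node_actions: Iterable[Message],
) -> List[Message]:
    """Attach the most recently seen node object to each converge message.

    Hypothetical pre-change workflow: the receiver has to correlate two
    message streams by node name before it can filter on node data.
    """
    latest_node: Dict[str, Any] = {}
    for action in node_actions:
        # Later messages for the same node overwrite earlier ones.
        latest_node[action["entity_name"]] = action["data"]

    joined = []
    for converge in converges:
        node = latest_node.get(converge["node_name"])  # None if never received
        joined.append({**converge, "node": node})
    return joined
```

With the node embedded in run_converge, the receiver reads `message["node"]` directly and never has to handle the case where the matching node message is missing or arrives late.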

Yeah, at some point we'll probably want to filter or manage the contents of the node object itself, but we likely want to do that as a core feature.

btm · 2016-06-30T15:49:55Z

I didn't pay a ton of attention to this schema in the past, but it looks like you've got three messages:

1. send the node object
2. start a run
3. finish a run

Is that right?

adamleff · 2016-06-30T15:50:54Z

Correct - and we plan to stop sending the "node object" message from Chef Client but still leave the schema defined as there are other implementations that will use that schema, such as the Chef Server sending an action message when a user edits an object with knife.

btm · 2016-06-30T15:53:42Z

And then this PR is putting the node object in #3 to avoid having to send both a #1 and #3 message at the end of the run? Yeah, sounds like it.

I'm a little surprised we'd go through the trouble of having different schemas for the start and end of the run. They're feeling like really heavy, use-specific schemas, rather than light schemas that could be used for multiple use cases. Like a resource_updated schema that could send a message for every converged resource or every five minutes or whatever you wanted, so the data collected is more real-time within a run.

Sending one message that represented the run and another that represented a node object state seemed like it made the schemas more flexible.

adamleff · 2016-06-30T15:57:46Z

For us internally at Chef, much of it was based on reusing existing schemas that were in use for other Chef products. Our current product-under-development started out consuming those existing schemas, so we're iterating on them as its needs evolve.

I would love to see us have per-event-method schemas and allow users to consume them, but that feels like a later phase or evolution of this. For now, this PR is trying to focus on tying together node details with the completion of a converge so users can more easily and accurately filter their converge messages by node details for reporting purposes.

lamont-granquist · 2016-06-30T16:11:25Z

👍

btm · 2016-06-30T16:27:59Z

As the decider this week: this was approved at today's meeting.

btm · 2016-06-30T16:29:02Z

We should chat elsewhere to see if we can steer this schema back towards one that is more general and project/product agnostic.

adamleff mentioned this pull request Jun 28, 2016

Adding node object to Data collector run_converge message chef/chef#5065

Merged

adamleff force-pushed the adamleff/rfc77-schema-update branch from 0381339 to 0b6308c Compare June 28, 2016 19:27

adamleff changed the title ~~adding the node object to run_converge messages~~ adding the node object to run_converge messages, resource schema cleanup Jun 28, 2016

adamleff force-pushed the adamleff/rfc77-schema-update branch from 0b6308c to 6a56787 Compare June 28, 2016 19:45

btm reviewed Jun 30, 2016

btm merged commit 612943e into master Jun 30, 2016

btm deleted the adamleff/rfc77-schema-update branch June 30, 2016 16:29
