chore(config): fix invalid enum schemas + provide more enum metadata #14586
Conversation
✅ Deploy Preview for vector-project canceled.
Soak Test Results (Baseline: 6b8c9da)

Explanation: A soak test is an integrated performance test for Vector in a repeatable rig, with varying configurations for Vector. What follows is a statistical summary of a brief Vector run for each configuration across the SHAs given above. The goal of these tests is to determine, quickly, whether Vector performance is changed by a pull request, and to what degree. Where appropriate, units are scaled per core.

The table below, if present, lists those experiments that have experienced a statistically significant change in throughput performance between the baseline and comparison SHAs with 90.0% confidence, OR that have been detected as newly erratic. Negative values mean the baseline is faster; positive values mean the comparison is faster. Results that do not exhibit more than a ±8.87% change in mean throughput are discarded. An experiment is erratic if its coefficient of variation is greater than 0.3. The abbreviated table is omitted if no interesting changes are observed.

No interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean ≥ ±8.87%. Fine details of change detection per experiment.
Two recommendations for clarity
Soak Test Results (Baseline: fd5ba44): no interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean ≥ ±8.87%.
Soak Test Results (Baseline: c3988f5): no interesting changes in throughput with confidence ≥ 90.00% and absolute Δ mean ≥ ±8.87%.
Context
This PR addresses two problems with the current configuration schema generation: types with invalid schemas, and a loss of information for enums and their variants.
First, we've fixed the schemas for some common enums -- `Compression`, `Concurrency`, and `TimeZone` -- that were invalid due to their use of specialized (de)serialization logic. Because the types themselves did not have a shape matching what we allow deserializing from, the generated schemas did not accurately reflect what users can put in a configuration.

We've manually implemented `Configurable` for these types, with the hope that we can further enhance the procedural macros in the future to allow more control over describing the true shape of a type, or potentially move them over to using specialized `serde` (de)serialization helpers, so that these hand-rolled implementations live in a single place and can be vetted accordingly.

Second, we've added a new piece of metadata for enum variants: the "logical" name. This is a metadata attribute added to the schema for every enum variant that specifies the identifier of the variant itself.
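To illustrate the kind of mismatch being fixed: a field such as compression is typically accepted either as a plain string or as an object in a Vector configuration, so a schema derived naively from the type's shape would not describe both forms. The exact key names below are illustrative, not taken from this PR:

```toml
# String form: just the algorithm name.
[sinks.my_sink]
type = "http"
compression = "gzip"

# Object form: algorithm plus extra options (key names illustrative).
# [sinks.my_sink.compression]
# algorithm = "gzip"
# level = "best"
```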
In many cases, the title and description of an enum variant describe, in human-centric terms, how the variant is used. But there is also a common need for a friendly, stable identifier that can differentiate variants without hard-coded logic that has to know, for example, that we use const schemas to describe enum tagging fields.
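The distinction that the logical name captures can be seen with a toy enum in plain Rust (no Vector APIs; the type and method names here are illustrative only): the serialized value users write in a configuration differs from the variant identifier itself.

```rust
// Toy stand-in for an enum like `Compression`; not Vector's actual API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compression {
    None,
    Gzip,
}

impl Compression {
    // The value a user writes in a config file -- what a `const` schema encodes.
    fn serialized_value(self) -> &'static str {
        match self {
            Compression::None => "none",
            Compression::Gzip => "gzip",
        }
    }

    // The variant identifier itself -- what the "logical" name exposes.
    fn logical_name(self) -> &'static str {
        match self {
            Compression::None => "None",
            Compression::Gzip => "Gzip",
        }
    }
}

fn main() {
    let c = Compression::Gzip;
    println!("serialized={}, logical={}", c.serialized_value(), c.logical_name());
}
```

A schema consumer that only sees the serialized value ("gzip") has to guess at the variant's identity; the logical name ("Gzip") makes it explicit.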
With `logical_name`, consumers of the schema can now better detect when a schema represents a true enum, as opposed to a type with multiple parsing strategies (e.g. `Compression` allowing either a single string or an object).
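A sketch of how such variant metadata might appear in the generated JSON Schema follows. The `_metadata` key, the nesting, and the titles are assumptions for illustration; the PR does not specify the exact serialized form:

```json
{
  "oneOf": [
    {
      "title": "No compression.",
      "const": "none",
      "_metadata": { "logical_name": "None" }
    },
    {
      "title": "Gzip compression.",
      "const": "gzip",
      "_metadata": { "logical_name": "Gzip" }
    }
  ]
}
```

With an attribute like this on every variant, a consumer can recover the variant identifiers directly instead of special-casing how tagging fields are encoded.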