Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type field to AlertingRule and RecordingRule struct #1558

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sarthaktyagi-505
Copy link

@sarthaktyagi-505 sarthaktyagi-505 commented Jul 10, 2024

Describe your PR

AlertingRules and ReocrdingRules are missing the type field in their struct, this leads to json unmarshalling issues in the library. This returns an error on client side which looks like

 <*errors.errorString | 0xc0002c9c50>: {
          s: "v1.RulesResult.Groups: []v1.RuleGroup: unmarshalerDecoder: failed to decode JSON into an alerting or recording rule, error found in #10 byte of ...|00:00Z\"}]}]}|..., bigger context ...|Time\":0,\"lastEvaluation\":\"0001-01-01T00:00:00Z\"}]}]}|...",
      }
      v1.RulesResult.Groups: []v1.RuleGroup: unmarshalerDecoder: failed to decode JSON into an alerting or recording rule, error found in #10 byte of ...|00:00Z"}]}]}|..., bigger context ...|Time":0,"lastEvaluation":"0001-01-01T00:00:00Z"}]}]}|...
  occurred


// following stacktrace
type field not present in rule

In order to fix this I added the type field to both the structs. There is a specific check for type field in Unmarshall functions for both the structs as can be seen in the following lines of code.

AlertingRule -> https://github.com/prometheus/client_golang/blob/main/api/prometheus/v1/api.go#L740-L751
RecordingRule -> https://github.com/prometheus/client_golang/blob/main/api/prometheus/v1/api.go#L785-L796

It works now and is able to unmarshall the object and return a valid RuleResult object back which can be marshalled in json on client side.

What type of PR is this?

/kind bugfix

Changelog Entry

[BUGFIX] fix GetRules API response to be json marshallable. #1557

Changelog Entry

	api: add  type field to AlertingRule and RecordingRule struct.

C.C - @ArthurSens @bwplotka @kakkoyun

Signed-off-by: sarthak.tyagi <sarthaktyagi100@gmail.com>
Signed-off-by: sarthak.tyagi <sarthaktyagi100@gmail.com>
Copy link
Member

@kakkoyun kakkoyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the detailed context and the links it made it easier to review.

@kakkoyun kakkoyun requested a review from ArthurSens July 11, 2024 07:42
@sarthaktyagi-505
Copy link
Author

Hi @ArthurSens can you have a look at this PR whenever you get time.

Thanks in advance
Sar

Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, not sure if I get what's the root cause of your error. Something is off. Why setting up Type is necessary. Looking on (r *AlertingRule) UnmarshalJSON(b []byte) erro it only cares about type in encoded message, not in our struct. I suspect you marshal client_golang object first and then try to unmarshal?

In other words, let's add regression test first to understand your use case. I think our unmarshaller works generally fine without this PR in terms of unmarshalling (it could use some improvement in err handling though).

@@ -612,6 +613,7 @@ type RecordingRule struct {
LastError string `json:"lastError,omitempty"`
EvaluationTime float64 `json:"evaluationTime"`
LastEvaluation time.Time `json:"lastEvaluation"`
Type string `json:"type"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we look on Rules type commentary you see the canonical way to identify if it's recording or alerting is through a type system. This means we don't need this type field here.

From the error you are getting, it sounds like you have problem with something marshalling this type back to JSON and then something unmarshalling (decoding) and expecting type field? In this case:

  1. For sure we should have regression test and agree it's a valid use case. Can you confirm that is the case? Simple call of API that uses unmarshalling should just work without Type in this code -- unless I am wrong, but reproduction test will verify (:
  2. To solve it, we could create MarshalJSON function that adds that field just for marshalling, WDYT? Alternatively we could keep this PR, but perhaps change the commentary on Rules slice? Otherwise we will have multiple ways of doing same thing.

To sum up, regression test that breaks on old code would be amazing, in minimum, WDYT?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in our use case we simply call the Rules endpoint using the prometheus API here https://github.com/prometheus/client_golang/blob/main/api/prometheus/v1/api.go#L1202. In doing so it calls the ruler endpoint and returns us a RulesResult struct which we just use as a data field in response json. But it seems like when internally in api.go we are trying to create unmarshall RuleResult struct and while doing so unmarshall alertingRule and recordingRule it expects the struct to have the type field to unmarshal the json successfully which it gets from the ruler endpoint.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it calls the ruler endpoint and returns us a RulesResult struct which we just use as a data field in response json. But it seems like when internally in api.go we are trying to create unmarshall RuleResult struct and while doing so unmarshall alertingRule and recordingRule it expects the struct to have the type field to unmarshal the json successfully which it gets from the ruler endpoint.

Cool. So literally Prometheus server is targeted, right? Then Prometheus server should give us type field and our code should work without this PR. In fact alertingRule.Type = string(RuleTypeAlerting) line does nothing given you added type field, but even without type field unmarshal should work because of anonymous struct with type we unmarshal first.

BUT, if you notice a problem there might be one, so the best is to create a very small unit test that e.g. mock Prometheus API returning recording and alerting with type specified. Then without your change it should fail, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have enough tests so it would be amazing anyway 🤗 - if not done I can try do it, but it will take time (busy).

@@ -721,11 +723,13 @@ func (rg *RuleGroup) UnmarshalJSON(b []byte) error {

for _, rule := range v.Rules {
alertingRule := AlertingRule{}
alertingRule.Type = string(RuleTypeAlerting)
if err := json.Unmarshal(rule, &alertingRule); err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So first of all this code was wrong, we don't even capture error anywhere no? How you got that type field not present in rule if it's not propagated here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are not justing the UnmarshalJSON function at all now that I see, we should be using UnMarshalJSON instead of json.UnMarshal what do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think changing the commentary on the Rules might be a better way since then we have a clear way of checking how the type field exists which is not clear right now. But you know the code better than me so happy to comply with what you think :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, let's add unit test for your "broken" case and we will go from there (:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants