Configdb: unsupported rule format version 0 #5387

Closed
friedrichg opened this issue Jun 5, 2023 · 4 comments · Fixed by #5443
Labels
component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. type/bug

Comments

@friedrichg
Member

friedrichg commented Jun 5, 2023

Describe the bug
It is not possible to read or modify the Alertmanager config or rules in configdb
after commit 5578eeb.

Cortex operators using configdb cannot upgrade to v1.14.1 or v1.15.2.

To Reproduce
Steps to reproduce the behavior:
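1. Start the configs service at commit 5578eeb, then POST an Alertmanager config to it from a second terminal. The "invalid rules" error is logged when the request arrives.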

$ docker run --rm -p8000:80 cortexproject/cortex:master-5578eeb -target=configs -configs.database.uri=memory://
level=info ts=2023-06-05T17:11:21.943969Z caller=main.go:193 msg="Starting Cortex" version="(version=1.13.0, branch=master, revision=5578eeb)"
level=info ts=2023-06-05T17:11:21.944369Z caller=server.go:260 http=[::]:80 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2023-06-05T17:11:21.948459Z caller=module_service.go:64 msg=initialising module=server
level=info ts=2023-06-05T17:11:21.950176Z caller=module_service.go:64 msg=initialising module=configs
level=info ts=2023-06-05T17:11:21.950239Z caller=cortex.go:400 msg="Cortex started"
level=error ts=2023-06-05T17:11:24.003665Z caller=api.go:186 msg="invalid rules" err="unsupported rule format version 0"
$ curl -v  -XPOST -d '{"alertmanager_config":"{\"receivers\":[{\"name\":\"empty\"}],\"route\":{\"receiver\":\"empty\"}}","rules_files":{},"template_files":{}}' -H 'X-Scope-OrgID: fake' http://127.0.0.1:8000/api/prom/configs/alertmanager

Returns

< HTTP/1.1 400 Bad Request
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Mon, 05 Jun 2023 17:11:24 GMT
< Content-Length: 49
<
Invalid rules: unsupported rule format version 0

Expected behavior

The request should succeed. The previous commit, c930bd6, works fine and returns 204 No Content:

$ docker run --rm -p8000:80 cortexproject/cortex:master-c930bd6 -target=configs -configs.database.uri=memory://
level=info ts=2023-06-05T17:16:35.41202Z caller=main.go:193 msg="Starting Cortex" version="(version=1.13.0, branch=master, revision=c930bd6)"
level=info ts=2023-06-05T17:16:35.413964Z caller=server.go:260 http=[::]:80 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2023-06-05T17:16:35.418405Z caller=module_service.go:64 msg=initialising module=server
level=info ts=2023-06-05T17:16:35.420625Z caller=module_service.go:64 msg=initialising module=configs
level=info ts=2023-06-05T17:16:35.422841Z caller=cortex.go:400 msg="Cortex started"
$ curl -v  -XPOST -d '{"alertmanager_config":"{\"receivers\":[{\"name\":\"empty\"}],\"route\":{\"receiver\":\"empty\"}}","rules_files":{},"template_files":{}}' -H 'X-Scope-OrgID: fake' http://127.0.0.1:8000/api/prom/configs/alertmanager
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/prom/configs/alertmanager HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.87.0
> Accept: */*
> X-Scope-OrgID: fake
> Content-Length: 136
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Date: Mon, 05 Jun 2023 17:20:00 GMT
<
* Connection #0 to host 127.0.0.1 left intact

Environment:

  • Infrastructure: laptop, but it also fails in k8s

Additional Context
The problem is still present in the latest Cortex master (d988472):

$ docker run --rm -p8000:80 cortexproject/cortex:master-d988472 -target=configs -configs.database.uri=memory://
Unable to find image 'cortexproject/cortex:master-d988472' locally
master-d988472: Pulling from cortexproject/cortex
f56be85fc22e: Already exists
f93ec6642e04: Pull complete
aa4e4b25d019: Pull complete
78bd25eac331: Pull complete
Digest: sha256:8bcbadaaf6e77107fad6fcb5f8a2d16c2240c78ba4aadf464f14759456a1fe7f
Status: Downloaded newer image for cortexproject/cortex:master-d988472
ts=2023-06-05T17:24:04.779773Z caller=main.go:201 level=info msg="Starting Cortex" version="(version=1.15.1, branch=master, revision=d988472)"
ts=2023-06-05T17:24:04.781333Z caller=server.go:323 level=info http=[::]:80 grpc=[::]:9095 msg="server listening on addresses"
ts=2023-06-05T17:24:04.786709Z caller=module_service.go:64 level=info msg=initialising module=server
ts=2023-06-05T17:24:04.790977Z caller=module_service.go:64 level=info msg=initialising module=configs
ts=2023-06-05T17:24:04.791389Z caller=cortex.go:422 level=info msg="Cortex started"
ts=2023-06-05T17:24:16.082692Z caller=api.go:186 level=error msg="invalid rules" err="unsupported rule format version 0"

The following minimal request fails too:

curl -v  -d '{}' -H 'X-Scope-OrgID: fake' http://127.0.0.1:8000/api/prom/configs/alertmanager
@friedrichg
Member Author

If we set rule_format_version to "2" in the request body, the problem appears to be solved:

$ curl -v  -d '{"rule_format_version": "2"}' -H 'X-Scope-OrgID: fake' http://127.0.0.1:8000/api/prom/configs/alertmanager
*   Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000 (#0)
> POST /api/prom/configs/alertmanager HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/7.87.0
> Accept: */*
> X-Scope-OrgID: fake
> Content-Length: 28
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Date: Mon, 05 Jun 2023 18:21:57 GMT
<
* Connection #0 to host 127.0.0.1 left intact
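
For anyone scripting this workaround, here is a minimal sketch in Python. It only builds the request and does not send it. The helper names with_rule_format_version and build_request are made up for illustration; the URL path, the X-Scope-OrgID header and the "2" value are taken from the curl call above. It assumes any rules in the payload are already written in the v2 format.

```python
import json
import urllib.request


def with_rule_format_version(config, version="2"):
    """Return a copy of the config payload with rule_format_version set.

    Hypothetical helper: it overwrites any existing value, on the
    assumption that the rules in the payload are already in v2 format.
    """
    body = dict(config)  # shallow copy so the caller's dict is not mutated
    body["rule_format_version"] = version
    return body


def build_request(base_url, tenant, config):
    """Build (but do not send) the POST used in the workaround above."""
    payload = json.dumps(with_rule_format_version(config)).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/prom/configs/alertmanager",
        data=payload,
        headers={"X-Scope-OrgID": tenant},
        method="POST",
    )
```

Passing the result of build_request("http://127.0.0.1:8000", "fake", {}) to urllib.request.urlopen would send the same request as the curl call; per the transcript above, a 204 response is the expected outcome.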

@friedrichg changed the title from "Configsdb: unsupported rule format version 0" to "Configdb: unsupported rule format version 0" on Jun 5, 2023
@friedrichg
Member Author

friedrichg commented Jun 5, 2023

A related problem: if rules were uploaded through configs at v1.13.2 and configs is upgraded later, rulers quietly fail for all tenants with

level=error ts=2023-06-05T18:30:30.919144332Z caller=ruler.go:490 msg="unable to list rules" err="unsupported rule format version 0"

This renders the ruler unusable for all tenants, not just the one that has an unsupported rule format version.
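
To make the failure mode concrete, here is a toy model in Python. It is not Cortex code and not necessarily how the fix works; every name in it (SUPPORTED_VERSIONS, parse_rules, list_rules_fail_fast, list_rules_isolated) is hypothetical. It only shows how one tenant's bad config can abort a listing for everyone, and what per-tenant isolation would look like by contrast.

```python
SUPPORTED_VERSIONS = {"2"}  # assumption for this toy model only


class UnsupportedRuleFormat(ValueError):
    """Raised when a config carries a rule format version we cannot parse."""


def parse_rules(config):
    """Return the rules of one tenant, or raise on an unsupported version."""
    version = config.get("rule_format_version")
    if version not in SUPPORTED_VERSIONS:
        raise UnsupportedRuleFormat(
            f"unsupported rule format version {version!r}"
        )
    return config.get("rules_files", {})


def list_rules_fail_fast(configs_by_tenant):
    """One bad tenant aborts the whole listing (the behavior reported above)."""
    return {t: parse_rules(c) for t, c in configs_by_tenant.items()}


def list_rules_isolated(configs_by_tenant):
    """Collect errors per tenant so healthy tenants still get their rules."""
    rules, errors = {}, {}
    for tenant, config in configs_by_tenant.items():
        try:
            rules[tenant] = parse_rules(config)
        except UnsupportedRuleFormat as err:
            errors[tenant] = str(err)
    return rules, errors
```

With one tenant on version "2" and another on version "1", list_rules_fail_fast raises for the whole batch, while list_rules_isolated returns the first tenant's rules and reports only the second tenant as failed.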

@jeromeinsf
Contributor

@rajagopalanand

@friedrichg
Member Author

Offending configurations can be found with

select id,owner_id,config,deleted_at from configs where config::text like '%"rule_format_version": "1"}%';

Note that this query also returns historical configurations.
To check a tenant's current configuration, look at the latest row with something like

select id,config from configs where owner_id='tenant' order by id desc limit 1;

After deleting the offending configurations, everything works fine.

@friedrichg friedrichg added the component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. label Jun 22, 2023
friedrichg added a commit that referenced this issue Jul 6, 2023
Fixes #5387

Signed-off-by: Friedrich Gonzalez <friedrichg@gmail.com>
alanprot pushed a commit that referenced this issue Jul 6, 2023
Fixes #5387

Signed-off-by: Friedrich Gonzalez <friedrichg@gmail.com>