-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support UTF-8 label matchers in Alertmanager #3353
Support UTF-8 label matchers in Alertmanager #3353
Conversation
Thank you @grobinson-grafana for such a thorough description. Can you please clarify
What are those cases that would cause a breaking change? |
Those are in the Restrictions 🙂
|
Hi, all! 👋
|
To add a little more information about ambiguous/inconsistent matchers, the current version of Alertmanager allows the following:
The changes proposed here would also restrict all of the examples above as 1. trailing commas are rejected and 2. unquoted text must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*. |
About the breaking changes:
|
I think trailing commas and quoting characters (single quotes and backticks) can be supported without much work. Supporting unquoted |
When I first started working on this about a month ago, I had intended to remove support for unquoted label values, but then later relaxed this to
The question I have then is what was the motivation for supporting both double quoted and unquoted regular expressions and literals in Alertmanager when double quoting seems to be a requirement in both Prometheus and OpenMetrics specifications? Knowing this, I would instead like to restrict the grammar further and require both regular expressions and literals to be double quoted as I had first intended. It's important to highlight that this would be a further breaking change, but it seems like a good one with respect to a.) compliance and b.) simplicity. |
You were looking at the text-based exposition formats (classic Prometheus and OM). I was talking about PromQL. PromQL is actually older than those text-based exposition formats, and while the latter had been inspired by the former, they are not the same. (For the sake of historical completeness: The first two exposition formats were protobuf-based (not needing any quoting) and JSON-based (following JSON quoting rules, but the JSON-based format disappeared early in the history of Prometheus).) You might ask the question why PromQL decided to be liberal with quoting character, and why the later text-based exposition formats did not follow that lead, but it's fairly academic at this point. The inspiration for the matchers in AM are the matchers in PromQL. (Note that the exposition formats do not have matchers.) And they followed the quoting rules of PromQL. I wouldn't deviate from that now because it would decrease consistency. (And I would not give the simplicity argument much weight because some people see it the other way, that it's simpler if you can use any quoting character.) |
I think that's a valid argument. (Just to clarify that my objection is only about removing the trailing comma and the alternative quote characters.) |
1c8970d
to
04037f6
Compare
I was looking at both PromQL and the exposition format - but while the exposition format requires regular expressions and literals to be double quoted, I see that PromQL also allows for single quotes and backticks. However, I didn't know that single quotes and backticks were also supported in PromQL when writing the original comment, and I apologize if the original comment was confusing! 🙂
This is useful to know, thanks! 😊
I see that PromQL is liberal with quoting in that single quotes, double quotes and backticks are supported. I think I am less concerned about that, but more concerned about supporting unquoted literals and regular expressions, as permitted in matchers for current versions of Alertmanager. For example
Does the exposition format also support single quotes and backticks, as I had understood it was just double quotes?
I think we can change it to support both single quotes and backticks, however I don't think those are support in matchers for current versions of Alertmanager either? For example |
This morning I had a call with @gotjosh! We looked at both
The biggest surprise for me is learning that 1. I do have pending changes where examples such as |
Note that the exposition formats do not deal with regular expressions at all. The exposition format is just not a good reference for matcher syntax because it (a) does not contain matchers itself and (b) its syntax is inspired by PromQL syntax, but it is not the same and it is also solving a different problem. I would vote for PromQL as the role model for matchers. That's what's had been done in the past, plus added tolerance around trying to "guess" what the user means (which does have the potential for trouble, see below).
Yes, totally possible. That's probably because I wasn't aware that backticks are allowed in PromQL when I coded the AM matchers. In summary, I think we should definitely support valid PromQL matcher syntax in AM land (implying that we have to add backtick support). And we may keep (or even add) some tolerance for "not quite correct" syntax, but we should avoid those parts that create trouble. (Which means I'm totally on board to require certain quoting where ambiguities arise, as you have explained above.) And we should have consistency in the level of tolerance between amtool and the AM UI (which isn't the case right now because the UI is less tolerant). |
That's good because it's what I was thinking too! It would mean that support for unquoted literals and regular expressions would be dropped, as those are not permitted in PromQL, and instead replaced with support for double quotes, single quotes and backticks.
I think also support for opening and closing parens in both the UI and
I would argue trailing commas are a good example of "not quite correct". The argument I have against trailing commas is that here commas indicate there is another matcher after this one, so either the user has forgotten to include it or appended additional text by mistake (which could be a comma). However, since it's also supported in PromQL we should support it here too! 🙂 |
The trailing comma thing (in PromQL and from there propagating into AM matchers) was probably done with the intention of simplifying auto-generation of selectors. (You programmatically iterate through a list of labels and want to create a selector from it. If you now have to special case the last iteration to not render a comma, everything gets clunkier.) Many got burned by the absence of trailing commas in JSON (and jsonnet allowed them for that same reason, I presume). |
I think the tolerance of missing curly braces and quotes was really coming from It's ultimately the call of the AM maintainers, but I would like to see a moderate amount of tolerance being kept. (And I think that's what you propose with this PR.) We shouldn't allow syntax that requires backtracking in parsing, as you said above. The UI being more strict is probably mostly a result that the shell quoting issue doesn't arrive there. And then the parsing is implemented again in the UI to give you validation even before you submit. |
As if we aren't juggling enough balls already, here's another thought: We want to allow all of UTF-8 in label names as well, which means we need quoting of the label name in matchers in that case, too. Not sure if that needs to be taken into account here to avoid another invasive change later. |
This week I've been working on optional opening and closing braces and trailing commas so those will be supported too 🙂
Sounds good to me!
That makes sense to me, but I think at some point the UI and amtool should be made consistent (perhaps using the grammar in the description). However, that's not this PR.
That's supported 🙂 However, like values, non |
I also want to highlight that |
a53cfeb
to
cceb537
Compare
I've just updated this PR to add support for optional open and close parens and optional trailing commas as discussed with @beorn7. |
I've put together two utils that will assist the review of this pull request and also create tooling to help migrate incompatible matchers in user's their configuration files.
|
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
…rors Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
Signed-off-by: George Robinson <george.robinson@grafana.com>
e87751a
to
f13fb75
Compare
This commit wires up the new and old matchers parser, which can be switched at runtime via a command line flag.
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the new parser with the existing parser in pkg/labels and can be run passing the "compliance" tag to go test.
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the new parser with the existing parser in pkg/labels and can be run passing the "compliance" tag to go test. Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. The compliance tests can be run passing the "compliance" tag when running go test. Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. Signed-off-by: George Robinson <george.robinson@grafana.com>
This commit adds the new label matchers parser as proposed in prometheus#3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. Signed-off-by: George Robinson <george.robinson@grafana.com>
* Add label matchers parser This commit adds the new label matchers parser as proposed in #3353. Included is a number of compliance tests comparing the grammar supported in the new parser with the existing parser in pkg/labels. Signed-off-by: George Robinson <george.robinson@grafana.com> --------- Signed-off-by: George Robinson <george.robinson@grafana.com>
Background
As explained in #3319, Alertmanager constrains label names to the characters
^[a-zA-Z_][a-zA-Z0-9_]*$
- the same as Prometheus.However, because Alertmanager is such a well known and popular open source project it makes sense to use it for other kinds of alert generators too, and not just Prometheus. An example of this is Grafana, where Alertmanager is used to manage alerts created from both Prometheus and non-Prometheus like datasources, including Graphite, InfluxDB, SQL, Loki and more. The issue with this though is that these other datasources do not share the same constraints as Prometheus when it comes to label names, and so using them with Alertmanager requires some kind of normalization to ensure labels match the set of allowed characters before the alert can be sent to the Alertmanager.
What this pull request does
@yuri-tceretian has proposed adding support for UTF-8 characters in #3321 that updates the existing regular expression to support double quoted UTF-8 sequences as label names, while keeping unquoted label names the same.
In this pull request I wanted to propose a different option to where we would add a "simple" LL(1) parser to Alertmanager to parse matchers instead.
Motivation
The original motivation for writing this parser was to add support for matching label names containing
.
and spaces to grafana/grafana. However, about the same time I learned that Prometheus maintainers agreed to add support for UTF-8 labels in Alertmanager, and so I decided to further the work to see if it could be upstreamed to Alertmanager instead.The original source code can be found at grobinson-grafana/matchers.
Supported grammar
This LL(1) parser in its current version is not 100% compatible with the existing regular expression, although it is close and can be modified if required. The grammar can be understood as follows:
Here are some examples of valid inputs:
and some examples of invalid inputs:
Breaking changes
#### Expressions must start and end with open and closing bracesAll expressions must start and end with{
and}
, although this can be relaxed if required. For examplefoo=bar
is not valid, it must be{foo=bar}
.#### Trailing commas are not permittedTrailing commas are not permitted. For example{foo=bar,}
is not valid, it must be{foo=bar}
.All non
[a-zA-Z_:][a-zA-Z0-9_:]*
values must be double quotedThe set of unquoted characters is now the same on both sides of the expression. In other words, both label names and label values without double quotes must match the regular expression
[a-zA-Z_:][a-zA-Z0-9_:]*
. For example{foo=!bar}
is not valid, it must be{foo="!bar"}
. In current versions of Alertmanager, unquoted label values can contain all UTF-8 code points with the exception of comma, such as{foo=!bar}
.There are two reasons for this:
It's no longer possible to write ambiguous matchers which I feel is something Alertmanager should fix. For example is
{foo=~}
equivalent to{foo="~"}
or{foo=~""}
?If we restrict the
=
,!
,~
characters to double quotes we can keep the grammar LL(1). Without this restriction lookahead/backtrack is required to parse matchers such as{foo==~!=!~bar}
which are valid in current versions of Alertmanager.Errors
One of the goals with this LL(1) parser is to provide better error messages than what is possible using just a regular expression. For example:
Benchmarks
I've also provided a number of benchmarks of both the LL(1) parser and regex parser that supports UTF-8. These can be found at grobinson-grafana/matchers-benchmarks. However, to run them
go.mod
must be updated to use the branch https://github.com/grafana/prometheus-alertmanager/tree/yuri-tceretian/utf-8-label-names here.