Create a match and/or in (also not-in) operator #822

Closed
kentquirk opened this issue Aug 1, 2023 · 6 comments

@kentquirk
Contributor

Is your feature request related to a problem? Please describe.

Sometimes it would be nice to be able to test a field for multiple values or a more complex form of containment, since the rules language doesn't support `or` semantics.

Describe the solution you'd like

One option is an operator `in`, where the `Value` field is a list of possible values. If the specified field matches any of the values, the condition succeeds. This requires allowing the `Value` field to be a list and adding code to support that when reading configuration files. It could support lists of any datatype -- for example, a list of numeric error codes.

This is easy to document and validate, but limited in that it only supports a list of exact matches.
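As a rough illustration of the `in` idea, here is a minimal sketch in Go. The `Condition` type, its `Values` field, and `matchesIn` are hypothetical names invented for this example; they are not Refinery's actual configuration types, and the comparison is deliberately naive.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified rule condition.
// It is NOT Refinery's real config type; Values is an assumed
// list-valued field for the proposed "in" operator.
type Condition struct {
	Field    string
	Operator string
	Values   []any
}

// matchesIn reports whether the event's field is exactly equal to any
// of the listed values. The comparison is type-sensitive (int 502 does
// not equal int64 502), so a real implementation would need to
// normalize datatypes first.
func matchesIn(c Condition, event map[string]any) bool {
	got, ok := event[c.Field]
	if !ok {
		return false
	}
	for _, want := range c.Values {
		if got == want {
			return true
		}
	}
	return false
}

func main() {
	cond := Condition{Field: "status_code", Operator: "in", Values: []any{500, 502, 503}}
	fmt.Println(matchesIn(cond, map[string]any{"status_code": 502})) // true
	fmt.Println(matchesIn(cond, map[string]any{"status_code": 200})) // false
}
```

The sketch only covers exact matches, which is the limitation noted above.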

An alternative is an operator `match`, which would accept a Go-style regular expression for its `Value` field. At parse time, the `Matches` function would be constructed with the regexp precompiled, so that rule evaluation just calls the match function. The validation code would also have to check that the regexp can be parsed. All columns would be coerced to strings before matching (the operator would have an implicit `Datatype` of "string").

This is a bit harder to explain to non-programmers, but powerful: it can match a list as `in` would, and it can also match more complex patterns.
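The compile-once, evaluate-many approach described for `match` could look roughly like the following Go sketch. `newMatcher` is a hypothetical helper name, not a function from Refinery; the only real API used is the standard library's `regexp.Compile` and `MatchString`.

```go
package main

import (
	"fmt"
	"regexp"
)

// newMatcher compiles the pattern once (the "parse time" step) and
// returns a closure that rule evaluation can call repeatedly.
// A compile error doubles as config validation.
// The name and signature are assumptions made for this sketch.
func newMatcher(pattern string) (func(value any) bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid match pattern %q: %w", pattern, err)
	}
	return func(value any) bool {
		// Coerce every value to a string before matching,
		// mirroring the implicit "string" datatype.
		return re.MatchString(fmt.Sprint(value))
	}, nil
}

func main() {
	matches, err := newMatcher(`^/api/v[0-9]+/users`)
	if err != nil {
		panic(err)
	}
	fmt.Println(matches("/api/v2/users/42")) // true
	fmt.Println(matches(404))                // false: coerced to "404"

	_, err = newMatcher(`(unclosed`)
	fmt.Println(err != nil) // true: rejected at validation time
}
```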

Context
See original request in Pollinators slack.

@kentquirk kentquirk added the type: enhancement New feature or request label Aug 1, 2023
@kamalmarhubi

kamalmarhubi commented Aug 3, 2023

Original requestor here :-)

For string types, regexp is more general for sure. But I think, even for programmers, writing `^(option|another_option|a_third_option)$` regexps gets pretty ugly / unwieldy once you're past about 3–5 options; my current rule has eight. Under the hood, the `in` operator could be desugared to a regexp, since that'd probably have better performance.
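A minimal sketch of that desugaring in Go, assuming string-only options: `inToRegexp` is a hypothetical name, and whether this is actually faster than a direct list or set lookup is not something the sketch demonstrates. Each option is escaped with `regexp.QuoteMeta` so that characters like `.` are matched literally.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// inToRegexp turns a list of literal options into an anchored
// alternation such as ^(?:a|b|c)$. Hypothetical helper for illustration.
func inToRegexp(options []string) (*regexp.Regexp, error) {
	if len(options) == 0 {
		// An empty alternation would match only the empty string,
		// which is not what an empty "in" list should mean.
		return nil, errors.New("in: empty option list")
	}
	quoted := make([]string, len(options))
	for i, o := range options {
		quoted[i] = regexp.QuoteMeta(o) // escape regexp metacharacters
	}
	return regexp.Compile("^(?:" + strings.Join(quoted, "|") + ")$")
}

func main() {
	re, err := inToRegexp([]string{"option", "another_option", "a.b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("another_option")) // true
	fmt.Println(re.MatchString("options"))        // false: anchored
	fmt.Println(re.MatchString("axb"))            // false: "." was escaped
}
```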

I think it'd also be nice to round out the rules language with disjunction and negation. :-)


Aside: I had secretly been hoping that since you and @TylerHelmuth are owners of pkg/ottl that refinery might learn to use that for rules. But since v2 came without that, I figure that's not actually on the roadmap 🙈. (I realise it's difficult to make that fit with the default Scope: trace semantic for rules!)

@TylerHelmuth
Contributor

OTTL in Refinery might happen some day, but it isn't a top priority at the moment. We'd have to build our own contexts, and the framework very much expects the underlying data structure to be the collector's OTLP data, whereas we have Honeycomb events.

We are still very much involved with OTTL in the collector.

@kentquirk
Contributor Author

I'm actually pretty tempted to do both. They seem to have distinct use cases. I'm going to plan to do at least one of them for 2.2.

@kentquirk kentquirk added this to the v2.2 milestone Aug 3, 2023
@tdarwin
Contributor

tdarwin commented Aug 8, 2023

Similar to having an `in` or `match` to handle matching against an array of possible values, it would be nice to have a negative counterpart, like `not-in`. `match`, being regex, could cover both via regex's own negative-match functionality, but `not-in` would be just as useful as `in` in a lot of cases.
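One point in favor of a dedicated `not-in`: Go's standard `regexp` package uses RE2 syntax, which has no lookahead, so a pattern such as `^(?!GET|HEAD)` fails to compile. The sketch below shows that, alongside a plain negated membership test. `notIn` and `compiles` are hypothetical helpers written for this example only.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

// notIn reports whether value is absent from options, i.e. the
// negation of an "in" check. Hypothetical helper for illustration.
func notIn(value string, options []string) bool {
	for _, o := range options {
		if value == o {
			return false
		}
	}
	return true
}

func main() {
	// Negative lookahead is not part of RE2 syntax.
	fmt.Println(compiles(`^(?!GET|HEAD)`)) // false

	methods := []string{"GET", "HEAD"}
	fmt.Println(notIn("POST", methods)) // true
	fmt.Println(notIn("GET", methods))  // false
}
```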

@kentquirk kentquirk modified the milestones: v2.2, v2.3 Nov 28, 2023
@kentquirk kentquirk changed the title Create a match or in operator Create a match and/or in operator Dec 5, 2023
@kentquirk kentquirk changed the title Create a match and/or in operator Create a match and/or in (also not-in) operator Dec 5, 2023
@kentquirk kentquirk self-assigned this Dec 6, 2023
kentquirk added a commit that referenced this issue Dec 11, 2023
## Which problem is this PR solving?

- Adds a regular expression `matches` operator to rules which should
make some rules easier to write, especially when dealing with URLs or
complex string fields.

## Short description of the changes

- Add `matches` operator
- Remove some unused switch cases from `conditionMatchesOperator` (to
avoid implementing another one)
- Add tests for it

This is part of #822.

This one will need further documentation, but that will be addressed in
a separate PR.
@MikeGoldsmith
Contributor

@kentquirk I believe this can be considered closed now that #939 has merged?

@kentquirk
Contributor Author

@MikeGoldsmith I was also working on `in` and `not-in`, but those operators would require `Value` to be able to take an array, which doesn't work given our YAML libraries. I'd have to take a `Values` parameter instead (or in addition), which is at best awkward and opens too many cans of worms for my comfort right now. So I've decided to drop that idea for now; we might revisit it later.
