Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matcher: add server name matcher #24

Merged
merged 7 commits into from
Mar 30, 2022
Merged

Conversation

kyessenov
Copy link
Contributor

This is intended to generalize SNI matching in TLS in Envoy's listener unified matcher implementation.
Signed-off-by: Kuat Yessenov kuat@google.com

Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

cc @snowp @markdroth

Copy link
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks fine to me, but I'd like input from @snowp as well.

xds/type/matcher/v3/domain.proto Outdated Show resolved Hide resolved
@kyessenov
Copy link
Contributor Author

Signed-off-by: Kuat Yessenov <kuat@google.com>
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me for what it tries to do, just one question on whether we can simplify

Comment on lines 42 to 45
// Match a server name by multiple wildcard domain matchers. If there are
// multiple wildcards matching the server name, the order of declaration is
// used to select the first match.
repeated DomainMatcher domain_matchers = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to reduce the nesting here and leverage the ListMatcher to express this "first match wins"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works in tandem with the longest suffix so I guess not. E.g. if we have two matchers for *.com but also have example.com that takes priority.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so there can be overlapping domains coming from each entry in this list? At first glance it looked like a list of independent suffixes to match on, so maybe the docs can be clearer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second though, let's just make all wildcards distinct across all matchers. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put all the exact matchs first (any order), then the wildcard matches (sorted by length) to get the same behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is that the implementation should be non-linear. Lists require traversal which doesn't scale well. This one can be done with an efficient compressed trie.

Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

Any more input @snowp? Can this be merged?

snowp
snowp previously approved these changes Mar 15, 2022
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@kyessenov
Copy link
Contributor Author

Gentle ping @markdroth .

@kyessenov
Copy link
Contributor Author

@markdroth This is one bit of lost functionality. Can you approve this so we can implement it in Envoy?

@mattklein123 mattklein123 self-assigned this Mar 29, 2022

// Matches a fully qualified server name against a set of wildcard domain
// names.
message ServerNameMatcher {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upfront, I completely agree we need something here to get parity with current Envoy config, and I have no problem with this being that solution.

Just wondering if we want a more general vs domain specific matcher. It seems like this could be a "PrefixOrExactMatcher" or something generic, where if there is a * we treat it as prefix, else exact?

I think the only behavioral implication would be that *w.example.com would be valid. I don't have any concerns with that though; its up to the control plane to reject that if it doesn't like it?

Just an idea, I don't have a strong opinion here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always add more sophisticated ones later. The * being an entire word is needed to create a trie, otherwise, we'd have to backtrack for some complicated patterns like *x.*y.*m.

A sophisticated template engine is hard to agree on. Google one is very opinionated https://github.com/google/http_pattern_matcher.

Copy link
Contributor

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM with small comments


// [#protodoc-title: Server name matcher]

// Matches a fully qualified server name against a set of wildcard domain
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this can be used for exact name matching also, right? I'm not sure if anyone would use that, but will domains be validated that they have a wildcard in them in the appropriate place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exact names are allowed. The wildcard, if present, must be the leftmost symbol before the dot.

Comment on lines 41 to 42
// Match a server name by multiple wildcard domain matchers. Each wildcard
// domain must appear at most once across the domain matchers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify what happens if a non-wildcard domain is specified? Is that allows per above or will it be rejected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's allowed. I spelled out that exact are allowed.

// ``*``. Note that partial wildcards are not supported, and values like
// ``*w.example.com`` are invalid.
message DomainMatcher {
// A non-empty set of wildcard domain names, e.g. ``www.example.com``,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

www.example.com isn't wildcard, so a little confusing, and related to my other comments.

Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

Thanks, updated the documentation to make it easier to follow.

Copy link
Contributor

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM with small comments

// the domain on a dot border. The wildcard matches one or more non-empty
// domain parts.
message DomainMatcher {
// A non-empty set of domain names with wildcards, e.g.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// A non-empty set of domain names with wildcards, e.g.
// A non-empty set of domain names with optional wildcards, e.g.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

// ``www.example.com``, ``*.com``, or ``*``.
repeated string domains = 1 [ (validate.rules).repeated = {min_items : 1} ];

// Match action to apply when the server name matches any of the wildcard
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Match action to apply when the server name matches any of the wildcard
// Match action to apply when the server name matches any of the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// The server name will be matched against all wildcard domains starting from
// the longest suffix, i.e. ``www.example.com`` input will be first matched
// against ``www.example.com``, then ``*.example.com``, then ``*.com``, then
// ``*``, until the associated matcher action accepts the input. Note that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a partial wildcard? Clarify?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-worded, thanks. This was straight copied from envoy originally.

@markdroth
Copy link
Contributor

Sorry for the delay here.

LGTM, modulo comments from @mattklein123.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Copy link
Contributor

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants