Clarify number and position of wildcard labels #145
Comments
This would not allow exceptions right now, since (AFAIR) the exclamation mark has to be the first character (and must not be followed by a dot). |
I believe the Mozilla implementation also didn't allow multiple wildcards. It seems like there are lots of implementations out there which don't. Given that their use is quite niche (what was the use case for the rule you added?), perhaps we need to bow to the inevitable and update the spec to match reality. |
Right, the Chrome code doesn't presently, but we could likely fix that. But in the implementations I've seen, I haven't seen any support for multi-wildcards, despite what the website says. |
Hi As you can see from our pull request here: we need to get the double wildcard issue addressed, as we need our customer domain suffix in the PSL. We name machines under MACHINE.GROUP.ACCOUNT.CLUSTER.bigv.io, so ...bigv.io would be ideal, or *..uk0.bigv.io. Thanks |
When you say "get it addressed", do you mean "update all the PSL-using code in the universe to support double wildcards"? It seems to me that the most likely way for this to be addressed would be to document what appears to be a fairly universal client limitation - that is, that double wildcards are not supported. |
@gerv @sleevi I'm curious, what would be the implications of having one of those entries in the list as of today? Would it cause some sort of crash, or will it simply "not work as expected"? I'm asking because it may make sense in the long run to support it. And if this is the direction we want to take, we may simply document it as "compatibility not guaranteed". Clients and consumers will eventually update their codebases. In other words, we allow it, but we document that "it may not work on all clients". It would be different if rolling it out today could cause Firefox or Chrome to crash. What do you think? |
I suspect the answer is implementation-dependent... We'd have to try it and see. |
@weppos I don't think we can make a demarcation between "crash" and "not-work". That's the sort of decision PSL consumers should take. For example, how would we quantify if a double-wildcard caused entries subsequent on the PSL to be ignored due to parsing issues? We don't have the means to quantify that. That said, I'm not fundamentally opposed to changing, so long as major platforms will end up supporting them. |
@benkirzhner and I are working on an Elixir public suffix library and have found it much, much easier to support only a single left-most wildcard than to support multiple wildcards or wildcards at any position. If we only support a single wildcard at the left-most position, our algorithm can do a very simple, constant-time set membership check (e.g. Given that up to now, the only usage of wildcards in the data file has been in the left-most position...is it really necessary to add extra complexity to every implementation to support wildcards at any level and multiple wildcards? If it is decided to support the added complexity, I would ask that the tests be updated to include examples of multiple wildcards and wildcards at less common positions so implementors can use those in their test suites. |
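As a rough illustration of the set membership check described above (a Python sketch, not the Elixir library's actual code): the function name `public_suffix` and the in-memory `rules` set are assumptions made for this example, and exception rules are left out entirely.

```python
def public_suffix(domain, rules):
    """Return the public suffix of `domain` under a simplified PSL lookup.

    Assumption: `rules` is a set of rule strings in which a wildcard only
    ever appears as a single leftmost label (e.g. "*.ck").
    Exception rules ("!...") are deliberately ignored in this sketch.
    """
    labels = domain.lower().split(".")
    # Try candidate suffixes from longest to shortest so the longest rule wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in rules:
            return candidate
        # Swap the candidate's leftmost label for "*" and test that form too.
        wildcard = ".".join(["*"] + labels[i + 1:])
        if wildcard in rules:
            return candidate
    # Nothing matched: fall back to the implicit "*" rule (the last label).
    return labels[-1]
```

Each candidate suffix costs two hash-set lookups, so the work grows with the number of labels in the domain rather than with the size of the list.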
*.label and *.*.label would allow fast lookups as we do it now. One or more wildcards somewhere in between (e.g. foo.*.bar) result in slower and more complex lookup algorithms. Apart from that, we need new exception rules. |
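To make the cost of inline wildcards concrete, here is a naive Python sketch (not taken from any of the implementations mentioned in this thread). With a rule like foo.*.bar the wildcard position is not known in advance, so this version compares labels rule by rule instead of doing one keyed lookup. The helper names `rule_matches` and `longest_match` are hypothetical.

```python
def rule_matches(rule, domain):
    """Check whether `rule` (wildcards allowed in any position) matches
    the rightmost labels of `domain`."""
    rule_labels = rule.split(".")
    domain_labels = domain.lower().split(".")
    if len(domain_labels) < len(rule_labels):
        return False
    # Align the rule with the tail of the domain and compare label by label.
    tail = domain_labels[-len(rule_labels):]
    return all(r == "*" or r == d for r, d in zip(rule_labels, tail))


def longest_match(rules, domain):
    """Scan every rule and return the matching one with the most labels,
    or None if no rule matches."""
    matches = [r for r in rules if rule_matches(r, domain)]
    return max(matches, key=lambda r: r.count("."), default=None)
```

This scan is linear in the number of rules. A trie keyed on reversed labels could reduce that, but it is still more machinery than a plain set lookup.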
@rockdaboot @gerv @sleevi given:
barring any objection I am going to update the format documentation and the site to permit the use of one single wildcard, and only as the left outermost label. Should the need change, we can always revisit the decision. But it looks like there is no current practical application of it. |
@weppos Is there any update on this (updating the documentation)? (There's also another correction outstanding, from #208.) Inline wildcards pose a bunch of problems, as discussed above. To add one more: It's near impossible to incorporate inline wildcard rules in the PSL DNS lookup service that I set up, due to the constraint that in the DNS, wildcards can only appear in the leftmost label. I would feel a bit better about the "spec coverage" of the service if it was official that inline wildcards don't need to be considered. |
Nudging this. @weppos wrote:
I support this change and agree this should be done. I will be working on documentation for #982 and can update the documentation to reflect what Simone said above within the PR I make for that once we settle on the wording for it - @sleevi any objection? |
None here. |
Excellent @sleevi, thx. Without letting my bias for action while I have some cycles to donate seem too cavalier, I think there is an opportunity to proceed. Considering that @weppos proposed the idea, I feel it safe to count that as a non-objection, and I think it is smart. Consensus. I just hate closing the ones with Gerv's notes in them :( |
Green light here too. Thanks for the help @dnsguru |
I had proposed to update this with the other PR #982 but that is on the wiki and this is up on publicsuffix.org, so I did some updates that I'd like both of you @weppos and @sleevi to review, and then I will do the needed PR for those in that repo. The following is the update I want to make, and I want to just be sure I have it right. From "Specification" in https://publicsuffix.org/list, with removed text in Specification
Examples of valid entries and ! or * / wildcard usage:
Examples of invalid entries and ! or * / wildcard usage:
|
@weppos wrote:
No objection. We only handle single, leftmost wildcards so the documentation change is very welcomed. |
"must be delimited by a dot on its right." sounds like something*.example.org would be a valid rule. That's probably not intended? |
Good catch.
So revising it to "...must be ***by itself, and*** be delimited by...", and
adding your example to the invalid examples table would address this.
On Mon, Mar 2, 2020, 1:38 AM Peter Thomassen wrote:
"must be delimited by a dot on its right." sounds like something*.example.org would be a valid rule. That's probably not intended?
|
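The wildcard constraints discussed so far (a single wildcard, leftmost position only, standing alone as a whole label and followed by a dot) could be checked along these lines. This is a Python sketch under stated assumptions: `wildcard_usage_ok` is a hypothetical helper, it looks only at wildcard placement, and it does not handle exception rules or any other part of the format. Rejecting a bare `*` is also an assumption of this example.

```python
def wildcard_usage_ok(rule):
    """Return True if `rule` uses "*" only as a single, whole, leftmost label.

    Only wildcard placement is checked; exception rules ("!...") and other
    syntax requirements are outside the scope of this sketch.
    """
    if "*" not in rule:
        return True
    labels = rule.split(".")
    # The wildcard must be the entire first label, with a label to its right.
    if labels[0] != "*" or len(labels) < 2:
        return False
    # No other label may contain a wildcard character.
    return all("*" not in label for label in labels[1:])
```

For example, `*.example.org` passes, while `*.*.example.org`, `foo.*.example.org`, and `something*.example.org` are all rejected.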
I suggest we take a look at the DNS RFC and how this is described. We are basically shifting to the same behavior.
(assuming we defined what a label is) |
Take bat.bar.foo and say that each of bat, bar and foo is a 'label' in the sense of Mockapetris, RFC 1034. Section 4.3.3, "Wildcards" (p. 25), covers * usage in a zone and (paraphrasing) states that the wildcard is "...always whole labels". It also goes on to define the explicit placement in the zones. Maybe we just state that the syntax for wildcard ("*") usage in the PSL is identical to that defined in RFC 1034 section 4.3.3, where it is
|
The final version of the bullet about wildcard use would read like this:
|
That sounds really flexible, and the voluntary nature of downstream use or incorporation of the PSL is that it works like a buffet, essentially, where folks can put what they want on their tray or add their own. This catalog and the maintainers (speaking for myself for sure) are completely non-prescriptive about what folks do downline from the PSL, but because the left-most only, single position wildcard is how DNS behaves, and this is the widest used approach that long-term PSL consumer/integrators have come to know and expect, we're really following that here due to the primary/legacy compatibility. |
@dnsguru I'm not personally against this change. I totally understand the motive behind it. I was merely pointing out that there might be users out there who might be impacted by this change, though I'm not aware of any. |
We've not received any reports in the past 8 months on affected parties related to compressing to the single, leftmost-only wildcard rules, so this will be closed. |
Do I understand correctly that based on the wide consensus and the lack of objections / reports of affected parties, the spec will now be adapted to the phrasing of #145 (comment) (i.e. close + fix)? For reference,
|
Yes, with a microtweak to specify the Unicode character for the asterisk... and I have added this to the Wiki here in the format section, hoping to deprecate use of the legacy publicsuffix.org references by referring them to the github.com/publicsuffix/list wiki instead, so that the content is more easily maintained within this project. |
Awesome! 🎉 Thanks for the work. |
Spec was changed such that what was previously a limitation of our implementation is no longer allowed, so no limitations remain. publicsuffix/list#145 (comment)
I made a commit a few weeks ago that introduced a rule like *.*.private.domain and the commit caused the build to fail. According to our website, that is supposed to be a valid format:
@rockdaboot mentioned a potential incompatibility of Chromium if we allow multiple wildcards. libpsl is currently not compatible with multiple wildcards, and to be fair I haven't tested my Ruby implementation either.
@gerv @sleevi can we clarify whether multiple wildcard labels are accepted? Specifically, we should be more clear if the following rules are valid:
The current list definition doesn't explicitly deny these rules, so they are supposed to be valid.
Once the decision is taken, I think we should: