Inconsistent leading dot tests for "example.com" #208
The tests seem to be ok, but the spec is not really clear: "Empty labels are not permitted" -> A leading or trailing dot implies an empty label, thus the check should return NULL. "meaning that leading and trailing dots are ignored", well... ignored if you check whether the string is a public suffix. But 'checkPublicSuffix()' does not do that. Its right argument is the 'shortest private suffix' that can be constructed from the left argument, and for that the test should fail if there is a leading or trailing dot. What we need is a test file with domains and a boolean value that says whether the domain is a public suffix or not. IMO, there would be much less confusion. BTW: AFAIR, there is tests.txt obsoleting test_psl.txt (just easier to parse).
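To make the "domain plus boolean" idea concrete, here is a minimal sketch of what such a table-driven check could look like. Everything in it is an assumption for illustration: the tiny `RULES` set, the `is_public_suffix` helper, and the `CASES` table are hypothetical, and the sketch ignores wildcard rules, exception rules, and the default rule of the real list. It also picks the strict reading, where an empty label makes the input invalid.

```python
from typing import List, Tuple

# Hypothetical miniature rule set (assumption). The real list is far larger
# and also has wildcard and exception rules, which this sketch ignores.
RULES = {"com", "uk", "co.uk"}


def is_public_suffix(domain: str) -> bool:
    """Exact-match check only; any empty label makes the input invalid."""
    labels = domain.lower().split(".")
    if "" in labels:
        # Strict reading: a leading or trailing dot implies an empty label.
        return False
    return ".".join(labels) in RULES


# Hypothetical cases in the shape "domain, expected boolean".
CASES: List[Tuple[str, bool]] = [
    ("com", True),
    ("example.com", False),
    ("co.uk", True),
    ("example.co.uk", False),
    (".com", False),
]


def run_cases(cases: List[Tuple[str, bool]]) -> List[Tuple[str, bool]]:
    """Return the cases whose actual result differs from the expected one."""
    return [(d, e) for d, e in cases if is_public_suffix(d) != e]
```

With a table like this, the leading-dot question becomes a single explicit row instead of something to infer from the 'shortest private suffix' tests.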
@benkirzhner I think @rockdaboot essentially answered the question. Notice that in the first two lines, the second argument is How you should specifically handle this is an implementation detail that is not enforced by the PSL guidelines. In my Ruby lib I perform some pre-validations and return an error. PS. As @rockdaboot mentioned, I encourage you to use the new The tests are currently the same, but the PSS. By coincidence (and I'm kind of shocked) I started working on an Elixir library a few days ago (I already developed a Ruby lib and a Go lib). I'd be happy to contribute if you want; I definitely don't want to duplicate the effort if you are at a more advanced stage.
Since the tested behavior appears to be what's supported by existing libraries, we'll change our implementation to conform. Still, the spec's formal algorithm doesn't seem unclear or ambiguous:
"...leading and trailing dots are ignored" sounds like it means that
This sounds like a great idea in addition to the existing tests file. The business reason we have for building the Elixir library is so we can group URLs by shortest private suffix, and having explicit test cases for that behavior is useful.
Thanks for the link; we've already switched to the new one. FYI, the website currently links to the old test data file. Are there plans to update the website to link to the new file?
Our project is currently a private repo as we build out the initial implementation and go through the process of making it public. Once we get the thumbs up to open source, we'd be happy to coordinate and add you as a contributor.
We have to distinguish between a rule defined as A rule cannot contain trailing dots. We have a test suite that ensures we don't incorporate that kind of rule by mistake. An input is, of course, dependent on the application itself. The constraint/suggestion is that you should pass an input without trailing and leading dots. I will be happy to try to reformulate the docs.
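As a rough illustration of the "pass an input without trailing and leading dots" suggestion, a caller-side pre-validation step might look like the sketch below. The function name `require_no_outer_dots` and the choice to raise `ValueError` are assumptions made for this example; they are not taken from the PSL guidelines or from any of the libraries mentioned in this thread.

```python
def require_no_outer_dots(name: str) -> str:
    """Reject inputs with a leading or trailing dot before any PSL lookup.

    Hypothetical helper: returns the lowercased name on success so the
    caller can hand it straight to its own lookup routine.
    """
    if not name:
        raise ValueError("empty input")
    if name.startswith(".") or name.endswith("."):
        raise ValueError(f"leading or trailing dot not accepted: {name!r}")
    return name.lower()
```

Returning an error here keeps the decision in the application, which matches the point that the list itself does not enforce how inputs are handled.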
@rockdaboot @benkirzhner can you elaborate? For the purpose of the algorithm, there is no difference between a private domain and a registry suffix. The process is the same. The semantics are currently assigned by the position in the list.
Thanks, I'll update it.
👍
@benkirzhner @weppos
libpsl uses [1] a hand-selected list of tests and [2] auto-generated tests from the PSL itself. While we could put the test data from [1] into a new file, plus your suggestions, [2] should maybe stay more of an algorithm!? I could also provide a script to auto-generate a file with all those tests from [2]. WDYT? [1] test-is-public.c P.S.: the algorithm of [2] in short (not all yet implemented):
WDYT?
@rockdaboot can you provide a small example (a few lines) of what the test file would look like?
I thought of a very simple format.
We could think about leading/trailing dots for
@simon-friedberger I think this can be closed; it is a very old issue, unless it is still a prominent one.
Agree, +1 to closing.
Hi! My team is building a publicsuffix library for Elixir and has run into an issue with the leading dot tests in the test file being inconsistent with the spec. The spec says "A domain or rule can be split into a list of labels using the separator '.' (dot). The separator is not part of any of the labels. Empty labels are not permitted, meaning that leading and trailing dots are ignored." This seems inconsistent with the following tests, which suggest that a domain with a leading dot should not match any rules:
and
Is there an inconsistency here, or are we understanding the spec and/or tests incorrectly?
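For clarity, the two readings of that sentence can be written down as a small sketch. Both function names are hypothetical and neither is taken from the spec or from an existing library; the point is only to show how the same input splits differently under each reading.

```python
from typing import List, Optional


def labels_ignoring_outer_dots(domain: str) -> List[str]:
    """Reading A: leading and trailing dots are dropped before splitting."""
    trimmed = domain.strip(".")
    return trimmed.split(".") if trimmed else []


def labels_rejecting_empty(domain: str) -> Optional[List[str]]:
    """Reading B: any empty label makes the whole input invalid (None)."""
    labels = domain.split(".")
    if any(label == "" for label in labels):
        return None
    return labels
```

Under reading A, `.example.com` yields the labels `example` and `com` and would go on to match rules like any other domain; under reading B it is rejected up front, which is what the leading-dot tests appear to expect.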