Any single label is considered a TLD #48
Comments
According to the documentation: "This function checks if domain is a public suffix by the means of the Mozilla Public Suffix List." and "localhost" is not a public suffix, so I would argue that the function should then return false. |
The algorithm, point 2, says that if no rules match, the prevailing rule is "*". |
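To make the fallback concrete, here is a minimal sketch of how a lookup behaves once the implicit "*" rule is applied. It is not libpsl's implementation: `sketch_rules` and `sketch_is_public_suffix` are hypothetical names, the rule list is a two-entry stand-in for the real PSL, and wildcard and exception rules are left out. Input is assumed to be lowercase without leading or trailing dots.

```c
#include <string.h>

/* Hypothetical, tiny stand-in for the rule list; the real PSL is far larger. */
const char *const sketch_rules[] = { "com", "co.uk" };

/* Returns 1 if `domain` is a public suffix under this sketch, else 0. */
int sketch_is_public_suffix(const char *domain)
{
    size_t i;

    /* An explicitly listed suffix is a public suffix. */
    for (i = 0; i < sizeof sketch_rules / sizeof sketch_rules[0]; i++)
        if (strcmp(domain, sketch_rules[i]) == 0)
            return 1;

    /* Otherwise the implicit "*" rule makes any single label a public
       suffix, whether or not it appears in the list. */
    return strchr(domain, '.') == NULL;
}
```

With this sketch, "localhost" yields 1 even though it is not listed, while "example.com" and "host1.localhost" yield 0.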
Hi, which document are you referring to? Thank you! |
The more I think about all that, it seems that such an API should only be used with FQDNs. Say, you query a host named ...and would the server send a cookie with /cc @bagder |
In cookies the domain attribute is tail matched with the host name used in the URL and I don't think there's ever any guarantee that it is in fact a FQDN. It is not even that easy to figure out, since the name resolving functions etc don't tell us that very easily. But yes, the PSL matching pretty much assumes that we give it a true and global domain name. Which, if we access public URLs, we will - but in private circumstances we may not always... |
Looking at the provided test data, it seems that single-label names do not match at all: |
Hi, sorry for latency... (and short answer from mobile phone)
@bagder What you say. In a private area the PSL might not always do the right thing. |
Any check of a single label. |
No, there's currently no way to switch off PSL in curl but there probably should be, for example for the cases where users run it within organizations using custom domains etc. I think it hasn't been that widely used yet. But can you elaborate on why the function should return TRUE on a domain that clearly is not listed in the public suffix list? I don't understand how that can be the job of this function. |
but "*" is not part of the public_list ;) Quick test from a browser: cookies from non-qualified domains are accepted, which is very common. |
@bagder looking at other consumers of libpsl (e.g. wget), it seems using psl_is_cookie_domain_acceptable could be a better solution than psl_is_public_suffix.
|
Maybe, yes. The irony here is of course that @rockdaboot himself wrote the libcurl adaption that uses libpsl =) |
We could certainly work around this issue in curl that way, but that won't make |
@bagder There must have been a reason for using psl_is_public_suffix at the time I wrote the patch. Sadly, I hardly remember and do not have the time to investigate right now. But today, I would use psl_is_cookie_domain_acceptable() in a cookie context, as @remicollet correctly found out. I am willing to have a look at the curl code in the next few days, if that is fine with you.
The documentation has to be clearer. I won't work against the proposed PSL... if there is something wrong or unclear in those rules, we have to open an issue at https://github.com/publicsuffix/list, asking for clarification. There are a few points unclear, see publicsuffix/list#145.
@remicollet, this is of course accepted and circumvents a check against the PSL (due to localhost==localhost). The real question is if xyz.localhost may set a cookie for localhost. The PSL rulez say NO. I personally would say default=NO, but YES if the user allows it explicitly. But that is beyond libpsl and should be handled at application level. IMO, there are these TODOs:
|
Why? "localhost" is not a public suffix, so why do the PSL rules limit non-PSL domains? |
The 'checkPublicSuffix' function checks for the 'shortest registrable domain part' of a given input domain, e.g. checkPublicSuffix('COM', null); means '.com' is not registrable (it is a public suffix). |
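As a rough illustration of what "shortest registrable domain part" means for a name that no explicit rule covers, the sketch below applies only the implicit "*" rule: the public suffix is the last label, so the registrable part is the last two labels, and a single label has none. `sketch_registrable_domain` is a hypothetical helper, not part of libpsl or the PSL test suite, and it assumes lowercase input without leading or trailing dots.

```c
#include <stddef.h>
#include <string.h>

/* Returns a pointer into `domain` at the start of its registrable part,
   or NULL if `domain` is a single label (i.e. itself a public suffix
   under the implicit "*" rule). */
const char *sketch_registrable_domain(const char *domain)
{
    const char *last = strrchr(domain, '.');
    const char *p;

    if (last == NULL)
        return NULL;            /* single label: nothing registrable */

    /* Walk back from the last dot to the label boundary before it. */
    for (p = last; p > domain; p--)
        if (p[-1] == '.')
            return p;

    return domain;              /* exactly two labels */
}
```

Under these assumptions, "localhost" gives NULL, while both "host1.localhost" and "a.host1.localhost" give "host1.localhost".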
Ugh. I find that very counter-intuitive and strange. But sure, it explains the functionality. |
Yes, it is implicit. Because of the rule mentioned above.
"localhost" is a PS, same explanation as above (prevailing * rule). Please look at https://publicsuffix.org/list/ if you don't believe me. I am not aware of a list of exceptions (e.g. private domains). Why not open an issue or PR to add "!localhost" to the PSL? An official decision is beyond libpsl. But I am fine to discuss adding libpsl functionality regarding private PSL rules. E.g. having a second, private list of rules that can be added to or removed from the PSL. A user could simply add e.g. "!localhost" or, for testing, remove/overwrite existing rules. So we cover all kinds of private and 'exotic' PSL usages. WDYT? If you like, open another issue just for this... I guess some details have to be discussed there. |
My surprise is not that "localhost" specifically isn't listed as a PSL. My surprise is that an API for PSL (being "Public Suffix List" - a list of public suffixes) gives back a response about a domain that is clearly not specified as a PSL. I think it is outside of PSL's jurisdiction. I would claim that the limitation on a "single label domain" is not a PSL job to enforce. If you're just following the PSL guidelines then my beef is with the PSL guidelines and I'm just here barking up the wrong tree. I'm thankful for your work and your library, don't mistake my complaining for anything else. |
As already mentioned on the PSL site, this "algorithm" is merely a plumber-ed list. For me the key question remains, why would any implementation say, "yes,
Then any single-label rule from the PSL could be spared. Anyway, thanks for your time! ;) |
@bagder as @rockdaboot correctly mentioned, there is a specific rule in the PSL algorithm that says:
And there are corresponding tests:
Which means, if a TLD is not listed, it should be considered a "standard" TLD. I did not make that rule, hence I can't share the exact initial motivations. However, my assumption is that it was done because the PSL is considered "a list of specific configuration" to be applied on top of the standard practice where the domain is third-level.second-level.tld. This potentially makes it possible to clear all standard suffixes from the list (e.g. in a pre-processing phase). Moreover, the list will not cause any denial of service to new TLDs (think about all the new gTLDs) or to TLDs that for one reason or another were not listed (although I'm quite sure we currently include all the TLDs that were available at ICANN before the new gTLD phase). I agree that this rule is very centric to the idea of the public web and it may cause conflicts when the PSL is used within a local network context. However, it's also dependent on the specific usage and implementation that the library makes of the list. For example, libpsl (or any other lib) may decide to provide a flag to use I hope this provides a little bit of extra context. If you should decide to suggest some specific changes to the PSL, I agree with @rockdaboot that the best thing to do is to open a ticket in the PSL repo itself. |
I still disagree. A single label is indeed a TLD, but if it isn't listed in the PSL I disagree that it should be returned as such. |
Same feeling here. |
So you opened an issue (discussion) at https://github.com/publicsuffix/list, where it belongs !? Background is that the PSL is an (optional) tool to prevent leaking of privacy via cookies AND that not all TLDs are listed in the PSL (e.g. when new ones come, it will take a while before an updated PSL is spread around everywhere). In some scenarios (e.g. secured local network, testing purposes), the PSL isn't appropriate and you don't want to use it. You will know when it isn't the right tool and should have a switch (e.g. command line option) to turn it off. Also, libpsl supports loading your own version of a PSL - you could bring in your own exceptions and thus have very fine-grained control (this also needs some support via curl). If you think there are other - automated - possibilities to detect when the PSL should be used and when not, you could give us a hint. But I believe such measures (e.g. DNS lookup to see if the IP is private) belong to the application level. |
@rockdaboot ultimately it's your decision on how to tweak the lib, but what about a flag that allows the developer to optionally decide (or switch) the behavior when a rule doesn't match? This is for example what I did here: By default, I return a |
Back to start. As I understand, @m6w6 had the problem that server 'localhost' was not able to set a cookie for domain 'localhost'. That was a bug due to not calling psl_is_cookie_domain_acceptable(). I provided a patch for curl to get this fixed (curl/curl#658). The question that remains is if 'host1.localhost' may set a cookie for 'localhost' - this is called a super-cookie. If this cookie is accepted, the next request to 'host2.localhost' will contain it. Information is transferred from host1 to host2. IMO, if you explicitly want this behavior, just switch PSL off via e.g. a command line switch. BTW, you can play around with the PSL using the 'psl' command from libpsl, e.g.:
|
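The two cases discussed here ('localhost' setting a cookie for 'localhost', and 'host1.localhost' setting one for 'localhost') can be sketched as a small acceptance check. This is an illustrative sketch only, not the code of psl_is_cookie_domain_acceptable(): both function names are hypothetical, the public-suffix test is reduced to the implicit "*" rule (any single label), and inputs are assumed to be lowercase host names rather than IP addresses.

```c
#include <string.h>

/* Hypothetical stand-in for a PSL lookup: only the implicit "*" rule,
   so every single label counts as a public suffix. */
int sketch_single_label_is_suffix(const char *domain)
{
    return strchr(domain, '.') == NULL;
}

/* Returns 1 if `hostname` may set a cookie for `cookie_domain`, else 0. */
int sketch_cookie_domain_acceptable(const char *hostname,
                                    const char *cookie_domain)
{
    size_t hlen = strlen(hostname);
    size_t dlen = strlen(cookie_domain);

    if (strcmp(hostname, cookie_domain) == 0)
        return 1;   /* a host may set a cookie for itself */

    if (dlen == 0 || dlen >= hlen)
        return 0;   /* cookie_domain cannot be a parent of hostname */

    if (hostname[hlen - dlen - 1] != '.' ||
        strcmp(hostname + (hlen - dlen), cookie_domain) != 0)
        return 0;   /* not a tail match on a label boundary */

    /* A parent domain is acceptable only if it is not a public suffix. */
    return !sketch_single_label_is_suffix(cookie_domain);
}
```

In this sketch ("localhost", "localhost") is accepted, ("host1.localhost", "localhost") is rejected as a super-cookie, and ("www.example.org", "example.org") is accepted. An application that wants the second case to pass would have to bypass the suffix check, for example through a user-controlled switch as suggested above.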
I have to add that the reason that 'host1.localhost' MUST NOT set a cookie for 'localhost' has nothing to do with the PSL. It is the RFC 6265 that disallows it. The code in the function psl_is_cookie_domain_acceptable() is:
cookie_domain_length = strlen(cookie_domain);
hostname_length = strlen(hostname);
if (cookie_domain_length >= hostname_length)
    return 0; /* cookie_domain is too long */ |
Is there a rationale behind this behavior? Why not use the existing TLD list instead of a match-all rule[1]?
I'm not sure the following statement holds true:
psl_is_public_suffix(ctx, "localhost")
Currently, that means that libcurl drops any cookies from localhost etc.[2] when built with libpsl support.
[1] https://github.com/rockdaboot/libpsl/blob/master/src/psl.c#L810
[2] https://github.com/curl/curl/blob/master/lib/cookie.c#L801
/cc @remicollet, @bagder