Whitelist Wikimedia domains #2334

Closed
MusikAnimal opened this issue Mar 22, 2019 · 13 comments · Fixed by #2352
Labels
broken site; MDFP (multi-domain first parties: lists of domains that should be treated as related to each other)

Comments

@MusikAnimal

What is your browser and browser version?

Chromium 72.0.3626.121 Ubuntu

What is broken and where?

https://tools.wmflabs.org/siteviews/?sites=en.wikibooks.org, which reports pageviews for Wikimedia sites. In case it is of any concern, Siteviews is not itself a tracker of any sort, nor are any of the Wikimedia sites that it queries.

What is the "culprit" domain?

In this case, https://en.wikibooks.org, but I think all Wikimedia domains may be affected.

What is your debug output for this domain?

**** ACTION_MAP for wikibooks.org
en.wikibooks.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 1553349523134,
  "userAction": ""
}
wikibooks.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 0,
  "userAction": ""
}
**** SNITCH_MAP for wikibooks.org
wikibooks.org [
  "wikipedia.org",
  "wikimedia.org",
  "wikidata.org"
]

I first reported this issue in an email exchange with extension-devs@, and was informed all Wikimedia sites have been whitelisted, and indeed they appear to be:

["wikipedia.org", "wikimedia.org", "wikimediafoundation.org", "wiktionary.org",
"wikiquote.org", "wikibooks.org", "wikisource.org", "wikinews.org",
"wikiversity.org", "mediawiki.org", "wikidata.org", "wikivoyage.org"],

I only just today noticed that en.wikibooks is blocked. However, as the author of the above tool (Siteviews) I can say that occasionally I get bug reports of the results not loading for other sites (such as https://en.wikiversity.org), and it is often because the user had Privacy Badger installed. This leads me to believe that somehow Wikimedia is still having some conflicts.

I am using Privacy Badger 2019.2.19.

@bcyphers
Contributor

Hi @MusikAnimal, thanks for filing this.

> and was informed all Wikimedia sites have been whitelisted, and indeed they appear to be:

Wikimedia domains are on the "multi-domain first party" (MDFP) list, which informs Privacy Badger about domains that are owned by the same first party. As a result, "wikibooks.org" will never be blocked on "wikipedia.org," and vice versa. However, the MDFP list does not affect the behavior of other first parties, like wmflabs.org. Therefore, if PB learns to block wikibooks.org, it will not be blocked on any of the wikimedia domains, but it will still be blocked on wmflabs.org.
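
The behaviour described above can be sketched roughly as follows. This is an illustration only, not Privacy Badger's actual code: `MDFP_GROUPS`, `sameFirstParty`, and `shouldBlock` are hypothetical names, the group is the list quoted earlier in this issue, and domains are assumed to be already reduced to their base domain.

```javascript
// Hypothetical sketch, not Privacy Badger's real implementation.
// Each inner array is one group of base domains owned by the same first party.
const MDFP_GROUPS = [
  ["wikipedia.org", "wikimedia.org", "wikimediafoundation.org", "wiktionary.org",
   "wikiquote.org", "wikibooks.org", "wikisource.org", "wikinews.org",
   "wikiversity.org", "mediawiki.org", "wikidata.org", "wikivoyage.org"],
];

// True when both base domains are identical or appear in the same MDFP group.
function sameFirstParty(pageBase, requestBase) {
  if (pageBase === requestBase) {
    return true;
  }
  return MDFP_GROUPS.some(
    (group) => group.includes(pageBase) && group.includes(requestBase)
  );
}

// A learned "block" action only applies when the request is third-party
// from the point of view of the page making it.
function shouldBlock(pageBase, requestBase, learnedAction) {
  if (sameFirstParty(pageBase, requestBase)) {
    return false;
  }
  return learnedAction === "block";
}

console.log(shouldBlock("wikipedia.org", "wikibooks.org", "block")); // false
console.log(shouldBlock("wmflabs.org", "wikibooks.org", "block"));   // true
```

Under these assumptions, the second call shows the situation reported here: wmflabs.org is not in the Wikimedia group, so a learned "block" for wikibooks.org still applies there.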

What's confusing here, though, is your snitch map. The wikimedia domains have been included in the MDFP list since it was installed back in 2016; see #781. There should be no way that "wikibooks.org" was seen as "tracking" on any of "wikipedia.org", "wikimedia.org", or "wikidata.org". How long have you had Privacy Badger installed?

@bcyphers
Contributor

bcyphers commented Mar 25, 2019

To follow up on the last comment: if you've had PB installed since before we added the MDFP list, your badger might have learned to block wikimedia domains before that time, and never "un-learned" to block them if you didn't reset Privacy Badger's local storage. We can fix that problem by adding a migration to Privacy Badger's startup procedure, as we have with the yellowlist.

However, if you installed it more recently, PB never should have learned that "wikibooks.org" was tracking on other wikimedia domains. In that case, there may be a bug with MDFP domains or tracking attribution that we need to look into more.

Edit: my assumptions here were wrong, see #2351

@MusikAnimal
Author

@bcyphers I've had Privacy Badger for a while, probably predating #781. All I know is I continue to get reports about WMF sites being blocked by Privacy Badger, so presumably these people have also had it installed for that long. I already have Privacy Badger on all the compatible browsers, so I'm currently unable to test it with a fresh install. Thanks for looking into this issue!

@bcyphers
Contributor

bcyphers commented Apr 1, 2019

Thanks! Hopefully a migration will fix the issue for you and your reporters. I'll start work on a PR later today.

@bcyphers
Contributor

bcyphers commented Apr 19, 2019

Hi @MusikAnimal, turns out I was wrong about how PB works -- being on the MDFP list doesn't stop PB from learning that a domain is a tracker. See #2351. That explains your snitch map.

The MDFP list should prevent PB from actually blocking wikimedia domains on, say, https://en.wikiversity.org/. So there is probably another bug somewhere, but I'm not sure what it is. Have you personally experienced Wikimedia domains being blocked?
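
To illustrate the distinction drawn here, the sketch below keeps learning and blocking as two separate steps. It is a simplification under stated assumptions, not Privacy Badger's code: `recordTracking`, `actionFor`, and `isBlockedOn` are hypothetical names, the related-domain list is shortened, and the threshold of three first-party sites is an assumption that happens to match the three entries in the snitch maps posted in this issue.

```javascript
// Hypothetical sketch: learning (snitch map) and blocking are separate steps.
const TRACKING_THRESHOLD = 3; // assumed threshold
const WIKIMEDIA = ["wikipedia.org", "wikimedia.org", "wikidata.org", "wikibooks.org"]; // shortened list

const snitchMap = new Map(); // tracker base domain -> Set of sites it was seen on

// Learning step: the related-domain list is not consulted, so related sites still count.
function recordTracking(trackerBase, siteBase) {
  if (!snitchMap.has(trackerBase)) {
    snitchMap.set(trackerBase, new Set());
  }
  snitchMap.get(trackerBase).add(siteBase);
}

// Derive the heuristic action from how many distinct sites were recorded.
function actionFor(trackerBase) {
  const sites = snitchMap.get(trackerBase);
  return sites && sites.size >= TRACKING_THRESHOLD ? "block" : "";
}

// Blocking step: the related-domain list is consulted only here.
function isBlockedOn(siteBase, trackerBase) {
  const related = WIKIMEDIA.includes(siteBase) && WIKIMEDIA.includes(trackerBase);
  return !related && actionFor(trackerBase) === "block";
}

for (const site of ["wikipedia.org", "wikimedia.org", "wikidata.org"]) {
  recordTracking("wikibooks.org", site);
}

console.log(actionFor("wikibooks.org"));                    // block
console.log(isBlockedOn("wikipedia.org", "wikibooks.org")); // false
console.log(isBlockedOn("wmflabs.org", "wikibooks.org"));   // true
```

In this sketch the snitch map fills up exactly as in the debug output, yet the domain is only blocked on a site outside the related list.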

@MusikAnimal
Author

> Have you personally experienced Wikimedia domains being blocked?

Yes, wikibooks.org, and I also just tried en.wikiversity.org and that is also blocked:

**** ACTION_MAP for wikiversity.org
en.wikiversity.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 1555993646536,
  "userAction": ""
}
wikiversity.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 0,
  "userAction": ""
}
**** SNITCH_MAP for wikiversity.org
wikiversity.org [
  "wikipedia.org",
  "mediawiki.org",
  "wikimedia.org"
]

but again I've had Privacy Badger for quite some time.

I just tried in a different browser with a fresh install, and no Wikimedia domains appear to be blocked :) However, as a possible separate issue, they are still listed as "potential trackers", and wikimedia.org specifically defaults to the no-cookies option. If I go for instance to github.com, I see github.githubassets.com is under "The domains below don't appear to be tracking you". My hope is all the Wikimedia domains could live there, too :)

Here is the snitch map for wikimedia.org on my fresh install of PB:

**** ACTION_MAP for wikimedia.org
wikimedia.org {
  "userAction": "",
  "dnt": false,
  "heuristicAction": "cookieblock",
  "nextUpdateTime": 0
}
**** SNITCH_MAP for wikimedia.org
wikimedia.org [
  "wikipedia.org",
  "wiktionary.org",
  "wikisource.org"
]

@bcyphers
Contributor

bcyphers commented Apr 21, 2019

Thanks. Can you give me a specific URL where wikibooks.org is blocked?

These domains will be added to the snitch map (#2351), but they should not be blocked when you navigate to a wikimedia domain (thanks to the MDFP list).

@MusikAnimal
Author

> Can you give me a specific URL where wikibooks.org is blocked?

https://tools.wmflabs.org/siteviews/?sites=en.wikibooks.org

You can use that same tool to test any other Wikimedia wiki. Again nothing is blocked on a fresh install of Privacy Badger.

Thanks!

@bcyphers
Contributor

For a quick fix, we can add wmflabs.org to the Wikimedia MDFP list. We should still address #2351 in the long term, but it's not necessary for this issue.
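
As a rough illustration of why this quick fix would help, the sketch below compares a related-domain check with and without wmflabs.org in the Wikimedia group. The names `before`, `after`, and `inSameGroup` are hypothetical, and the real list format in Privacy Badger may differ.

```javascript
// Hypothetical sketch of the proposed quick fix.
// `before` is the Wikimedia group as quoted earlier in this issue.
const before = [
  "wikipedia.org", "wikimedia.org", "wikimediafoundation.org", "wiktionary.org",
  "wikiquote.org", "wikibooks.org", "wikisource.org", "wikinews.org",
  "wikiversity.org", "mediawiki.org", "wikidata.org", "wikivoyage.org",
];

// `after` is the same group with wmflabs.org appended.
const after = [...before, "wmflabs.org"];

// Two base domains are treated as related when both are in the group.
function inSameGroup(group, a, b) {
  return group.includes(a) && group.includes(b);
}

console.log(inSameGroup(before, "wmflabs.org", "wikibooks.org")); // false
console.log(inSameGroup(after, "wmflabs.org", "wikibooks.org"));  // true
```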

@bcyphers
Contributor

Opened #2352 to fix the immediate issue. Are there any other wikimedia-maintained domains that you've had this issue with @MusikAnimal?

@MusikAnimal
Author

> Opened #2352 to fix the immediate issue. Are there any other wikimedia-maintained domains that you've had this issue with @MusikAnimal?

None that I'm aware of. Thank you so much for the help! :)

@ghostwords added the MDFP label (multi-domain first parties: lists of domains that should be treated as related to each other) on May 22, 2019

@MusikAnimal
Author

@bcyphers I am still getting reports about Wikimedia domains being blocked. https://meta.wikimedia.org/wiki/Talk:Pageviews_Analysis#Pageviews_bug_report (permalink)

Was #2352 supposed to affect older installations of Privacy Badger?

@ghostwords
Member

We haven't released the fix yet, but we will, in the next week or two.
