Add referrer spam blocklist #123

Closed
Zodiac1978 opened this issue Jan 28, 2019 · 10 comments

@Zodiac1978
Member

Proof of concept plugin:
https://gist.github.com/Zodiac1978/093244d0b3837cc83c37c2a2a2dc15ea

@Zodiac1978
Member Author

I am not sure how to proceed here.

Option 1: New plugin with just this option, like "Blacklist Updater"

Option 2: Include in Statify Blacklist: stklcode/statify-blacklist#18

Option 3: Include in main plugin, optional

Option 4: Include in main plugin, non-optional

What do you think?

@Zodiac1978
Member Author

And I am not sure about caching/saving.

Blacklist Updater is using an option and wp_remote_get. I am not a hardcore developer, so I am unsure whether this is the best way here. What about using file instead of wp_remote_get and a transient instead of an option?
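The practical difference behind the option-versus-transient question is expiry: a WordPress option is stored until it is overwritten, while a transient is stored with an expiration time. The following is a minimal Python sketch of that expiry behaviour only; it is not WordPress code, and the class name `ExpiringCache`, the `fetch` callback, and the injected `clock` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable


class ExpiringCache:
    """Holds one fetched list together with the time it was stored."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so the expiry can be tested
        self._value = None
        self._stored_at = None

    def get(self, fetch: Callable[[], list[str]]) -> list[str]:
        """Return the cached list, calling fetch() only when the copy is stale."""
        now = self.clock()
        fresh = self._stored_at is not None and now - self._stored_at < self.ttl_seconds
        if not fresh:
            # The slow external request happens here, at most once per TTL window.
            self._value = fetch()
            self._stored_at = now
        return self._value
```

With a TTL of 12 or 24 hours, page views inside that window never wait on the remote source; only the first request after expiry pays for the fetch.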

@websupporter
Contributor

The file is under public domain which is good.

I would include the file into the plugin. The disadvantage of such an approach is, of course, that updating the file requires a plugin update. The advantages, in my eyes, are:

  • You do not have to call an external source and wait for it.
  • You are in control of the actual content

As for the options: I like no. 4 and no. 2

Another way could be to include it and add something like an update button, which would then fetch the current file. I just don't like this idea of auto-fetching a 3rd party file.

@Zodiac1978
Member Author

> I would include the file into the plugin.

The disadvantage is too big IMHO. If we add it to Statify itself and not Statify Blacklist our only way is to skip tracking, so we need recent information. If the referrer is saved there is no way to clean up without Statify Blacklist. Therefore I think we need to decouple this from plugin updates.
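To make "skip tracking" concrete: the check has to run against the referrer at the moment of the view. The sketch below is a Python illustration, not Statify code. It assumes the list is plain text with one domain per line, and it assumes that subdomains of a listed domain should also be blocked; the function names `load_blocklist` and `is_spam_referrer` are hypothetical.

```python
from urllib.parse import urlsplit


def load_blocklist(text: str) -> set[str]:
    """Parse a plain-text list, assumed to hold one domain per line."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def is_spam_referrer(referrer: str, blocklist: set[str]) -> bool:
    """Return True when the referrer host or one of its parent domains is listed."""
    host = (urlsplit(referrer).hostname or "").lower()
    labels = host.split(".")
    # For "a.b.example" this checks "a.b.example", then "b.example", then "example".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))
```

A tracker would call `is_spam_referrer` before storing the view and drop the hit when it returns `True`; a view that was stored before its domain appeared on the list is not affected by later list updates.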

> I just don't like this idea of auto-fetching a 3rd party file.

"Our" own plugin Blacklist Updater is exactly doing this and just this. And Matomo is an open source project, and the list is public domain (CC0) and community driven. And maybe we can be listed as software using this list, to get some awareness for Statify. I don't see any problems. Any performance issues can be solved through internal caching (saving the list as an option value or transient) with a cron job running every 12 or 24 hours, I think. Still issues with it?

@websupporter
Contributor

The latency problem can be resolved by caching.

The Matomo project is great. I do not have any issues with the project, nor do I mistrust this project in particular. I just mistrust 3rd party auto-fetching, as it opens a door into the plugin that we do not control.

I just checked how often the project gets updated: almost daily. So I see how updating via plugin update is not a feasible option. If there are no other voices chiming in against this 3rd party fetching, I do not want to stop a really cool feature. Though instead of option 4 I would now vote for option 3 (in Statify, but optional).

Keep in mind: if the Matomo project's password falls into the wrong hands after our direct integration has shipped, we have no off switch for those Statify versions. Sure, we would update the plugin and all, but it's an uncomfortable thought.


> "Our" own plugin Blacklist Updater is exactly doing this and just this.

Ha, I wasn't aware of this 😆 Doesn't make me more comfortable though.

@krafit
Member

krafit commented Feb 1, 2019

I'd opt for Option 3

We could address websupporter's valid security concerns by tunnelling it through api.pluginkollektiv.org: the server would auto-fetch and cache the file, sanitise it, and let clients get the data via our API.

By doing so, we can ensure the data's integrity, and we are free to switch the source for our blacklist in the future.
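As an illustration of what the "sanitise it" step could look like on the server side, here is a minimal Python sketch. It is an assumption, not a description of the pluginkollektiv API: it assumes the upstream file holds one domain per line and keeps only lines that look like plain ASCII hostnames. The function name `sanitise_blocklist` and the `max_entries` cap are hypothetical, and entries containing non-ASCII characters would be dropped by this pattern.

```python
import re

# Conservative hostname pattern: dot-separated labels of letters, digits and hyphens.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
_HOSTNAME = re.compile(rf"{_LABEL}(?:\.{_LABEL})+")


def sanitise_blocklist(raw: str, max_entries: int = 100_000) -> list[str]:
    """Keep only lines that look like plain hostnames; drop everything else."""
    clean = []
    seen = set()
    for line in raw.splitlines():
        candidate = line.strip().lower()
        if len(candidate) > 253 or not _HOSTNAME.fullmatch(candidate):
            continue  # rejects markup, full URLs, empty lines and malformed labels
        if candidate not in seen:
            seen.add(candidate)
            clean.append(candidate)
        if len(clean) >= max_entries:
            break  # stop rather than pass on an implausibly large list
    return clean
```

Clients would then only ever receive entries that passed this filter, so a tampered upstream file could at worst add or remove domains, not inject arbitrary content.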

@Zodiac1978
Member Author

> tunnelling it through api.pluginkollektiv.org

I wouldn't call that more transparent/secure than the public GitHub file of a well-known company ... and it creates a bottleneck, because no one else on the team has control over the API server.

@krafit
Member

krafit commented Feb 1, 2019

> I wouldn't call that more transparent/secure

Well, it's as transparent as we make it. And regarding security: the moment you open WordPress to get data from a third party service (no matter if GitHub or pluginkollektiv.org) you introduce some level of insecurity. IMHO routing requests through our own server adds a layer of security in comparison to the direct request to GitHub.

> and it creates a bottleneck

That's an organisational rather than a technical topic. We might want to start a general discussion on bottlenecks on Slack; there are plenty.

@Zodiac1978
Member Author

😞

@Zodiac1978
Member Author

Closing this, because it was a nice idea with a proof of concept, but ...

  1. This does not seem to be a serious problem for people: no one is pushing this issue, and not many users are complaining.

  2. The whole approach doesn't make much sense. Even if the repository gets frequent updates, the list is always outdated and incomplete by design. We need the information at the moment of the view. Once a view has been tracked, there is not much point in getting the updated list, because Statify itself has no cleanup routine. We are always too late to the party.
    Non spam in list? matomo-org/referrer-spam-list#1144

  3. There must be a better technical solution to this problem without relying on a list which needs to be updated "one single domain per pull request". We do not win this race with a millstone around our neck.
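The timing argument in point 2 can be shown with a small Python model. All data here is hypothetical, and the names `View` and `stored_views` are introduced for illustration only: a view is filtered only if its referrer is already on the list on the day of the view, and nothing is removed afterwards.

```python
from dataclasses import dataclass


@dataclass
class View:
    day: int        # day on which the view was recorded
    referrer: str   # referrer domain of that view


def stored_views(views: list[View], listed_on: dict[str, int]) -> list[View]:
    """Keep every view whose referrer was not yet listed on the day of the view."""
    kept = []
    for view in views:
        listing_day = listed_on.get(view.referrer)
        blocked = listing_day is not None and listing_day <= view.day
        if not blocked:
            kept.append(view)  # stays stored: no cleanup pass runs later
    return kept
```

If a spam domain sends hits on days 1, 2 and 5 and is added to the list on day 4, only the day 5 hit is skipped; the two earlier hits remain in the statistics no matter how often the list is refreshed.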
