This repository has been archived by the owner on Dec 4, 2019. It is now read-only.

Permanently delete url generated from webhint online scanner #216

bhavik09071990 opened this issue Nov 6, 2018 · 4 comments

bhavik09071990 commented Nov 6, 2018

We used the online webhint scanner to generate a report for our website.

I would like to know if it is possible to permanently delete our website's footprint, that is, the results URL that the online webhint scanner gave us, along with any other data specific to our website. Please let me know if this is possible.

@molant molant transferred this issue from webhintio/rfcs Jan 8, 2019
molant (Member) commented Jan 10, 2019

Because anyone can submit any URL, we have to be careful about how we handle deletion. I can think of two things:

  • Opt out of scanning. We could have something à la robots.txt that tells webhint not to analyze the website with the online scanner. This would be useful for websites that do not want anyone analyzing their site and would prevent future scans.
  • On top of that, we could add a form that looks for that file/configuration and, if it is present, deletes all previous results (see the sketch at the end of this comment).

@antross pinging you because we were discussing this earlier.
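
Roughly, the pre-scan check could look like the sketch below. To be clear, this is not existing webhint code: the `isOptedOut` name, the `webhint.io` user-agent token, the tiny hand-rolled robots.txt parser, and the assumption of Node 18+ for the global `fetch` are all just illustrative.

```ts
// Hypothetical helper: fetch the site's robots.txt and check whether it
// disallows the "webhint.io" user agent from the root path.
export async function isOptedOut(siteUrl: string): Promise<boolean> {
    const robotsUrl = new URL('/robots.txt', siteUrl).href;

    try {
        const response = await fetch(robotsUrl);

        if (!response.ok) {
            return false; // No readable robots.txt ⇒ no opt-out.
        }

        const body = await response.text();
        const lines = body.split(/\r?\n/).map((line) => line.trim().toLowerCase());

        let inWebhintGroup = false;

        for (const line of lines) {
            if (line.startsWith('user-agent:')) {
                // Track whether we are inside the group addressed to webhint.
                inWebhintGroup = line.includes('webhint.io');
            } else if (inWebhintGroup && line.startsWith('disallow:') &&
                       line.split(':')[1].trim() === '/') {
                return true; // The site explicitly opts out of webhint scans.
            }
        }

        return false;
    } catch {
        return false; // Network/DNS errors are treated as "not opted out".
    }
}
```

The online scanner would call this before queuing a job and refuse to scan when it returns `true`.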

@molant molant transferred this issue from webhintio/hint Jan 16, 2019
sarvaje (Contributor) commented Jan 16, 2019

> Opt out of scanning. We could have something à la robots.txt that tells webhint not to analyze the website with the online scanner. This would be useful for websites that do not want anyone analyzing their site and would prevent future scans.

I see one problem here: how do you know that I'm the "owner" of the URL I want to block? What if I go and block www.bing.com? What about a bot that blocks all the URLs? (Because people like to do these things.)

Deleting results is one thing, but allowing people to block URLs seems dangerous to me.

molant (Member) commented Jan 16, 2019

Actually, it's the opposite. We shouldn't remove any results on request alone because we cannot verify that the requester owns the website, but if the website has a robots.txt similar to the following, then we can be sure that we shouldn't scan it:

https://example.com/robots.txt

```
User-agent: webhint.io
Disallow: /
```

We can be pretty sure that no one other than an admin of that website could have added that robots.txt (or whatever file we decide to use), and we should respect that.
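
And, roughly, the deletion side; again just a sketch, not existing code. It reuses the hypothetical `isOptedOut()` check from my earlier comment as the ownership proof, and `deleteResultsFor()` is a stand-in for whatever storage call the online service actually has:

```ts
// Hypothetical handler behind a "request deletion" form: stored results are
// purged only when the site itself declares the opt-out in its robots.txt.
async function handleDeletionRequest(siteUrl: string): Promise<string> {
    if (!(await isOptedOut(siteUrl))) {
        return 'No opt-out found in robots.txt; nothing was deleted.';
    }

    // Placeholder for the real storage call that removes previous scans.
    await deleteResultsFor(siteUrl);

    return 'Previous scan results for this site have been deleted.';
}

// Stand-ins so the sketch is self-contained.
declare function isOptedOut(siteUrl: string): Promise<boolean>;
declare function deleteResultsFor(siteUrl: string): Promise<void>;
```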

sarvaje (Contributor) commented Jan 16, 2019

Ahhh, OK, I thought you were talking about doing that on our side. Then it is OK.

@molant molant added the Epic label Jan 17, 2019