This repository has been archived by the owner on Apr 6, 2020. It is now read-only.

Consider gathering anonymous analytics on disable / standard block rate #29

Open
pes10k opened this issue Oct 30, 2017 · 2 comments

@pes10k
Owner

pes10k commented Oct 30, 2017

Would need to make sure it was anonymous, didn't collect URLs (only aggregate statistics), and in general was respectful to users.

Would need a lot of discussion / planning to get right, but don't want to lose the discussion from issue #24.
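
For illustration only, an aggregate-only report could look something like the sketch below; the field names and the collection endpoint are placeholders, not anything the extension currently does.

```typescript
// Hypothetical shape for an aggregate-only report: counts per blocked
// Web API standard and a total disable count, never URLs or per-page data.
interface AggregateReport {
  extensionVersion: string;
  // e.g. { "Beacon": 12, "WebRTC": 3 }: how many times each standard
  // was blocked since the last report.
  blockCounts: Record<string, number>;
  // How many times blocking was disabled entirely, as a bare total.
  disableCount: number;
}

// Placeholder submission routine; the endpoint here does not exist.
async function submitReport(report: AggregateReport): Promise<void> {
  await fetch("https://example.invalid/collect", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(report),
  });
}
```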

@jawz101

jawz101 commented Oct 30, 2017

My first thought is to only attempt it on sites that provide a robots.txt file. But it would have to be something that could become a W3C standard itself, since robots.txt is sort of a loose, "handshake deal" protection against web crawlers.
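
As a rough illustration of that idea, a check along these lines could gate collection on whether a site serves a robots.txt file at all (the function name and the HEAD-request approach are just a sketch):

```typescript
// Only count a site toward the aggregate statistics if it serves a
// robots.txt file, treating its presence as a weak signal that the
// operator has stated preferences about automated collection.
async function siteServesRobotsTxt(origin: string): Promise<boolean> {
  try {
    const response = await fetch(new URL("/robots.txt", origin).toString(), {
      method: "HEAD",
    });
    return response.ok;
  } catch {
    // On a network error, err on the side of not collecting.
    return false;
  }
}
```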

Probably the best references are how Internet archival sites choose to crawl the web. If they crawl with some sort of moral standard that avoids compromising a user's or a website's security and infrastructure, maybe that would tell us what could reasonably be logged in a public manner.

@jawz101

jawz101 commented Oct 30, 2017


https://archive.org/help/aboutsearch.htm
http://archive.org/wayback/available?url=google.com
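
The second link is the Wayback Machine availability endpoint; a minimal query could look like the following sketch (the exact response shape should be checked against the current Archive.org documentation):

```typescript
// Ask archive.org whether it has a snapshot of a URL; returns the closest
// snapshot URL, or null when nothing has been archived.
async function waybackSnapshotUrl(url: string): Promise<string | null> {
  const endpoint =
    "https://archive.org/wayback/available?url=" + encodeURIComponent(url);
  const data = await (await fetch(endpoint)).json();
  return data?.archived_snapshots?.closest?.url ?? null;
}
```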
