
Option to try erase hidden information #55

Open
Treora opened this issue Sep 4, 2020 · 0 comments

Treora commented Sep 4, 2020

One of the use cases of freeze-dry is to snapshot web pages in order to share them with others. If a page is personalised, e.g. when a user snapshots their shopping cart in a web shop, the page may contain private information one would rather not share. If that information is visible, the user can notice it and choose not to share the page (or edit it with other tools first). But if the information is hidden in the page, for example when a session ID or anti-CSRF token is stored in a hidden input field, they might accidentally share private information they could not see themselves.

I once heard that this risk of accidentally sharing hidden, sensitive information was one of the reasons Mozilla’s PageShot experiment ultimately stopped capturing the DOM and output only a screenshot (despite its excellent work on capturing the DOM, similar to freeze-dry).

Freeze-dry already removes JavaScript, which eliminates one potential source of hidden information. We could also consider adding an option to remove <input type="hidden"> elements. And perhaps data-… attributes? Are there other invisible elements or attributes that are often used for sensitive data, and that we should therefore consider filtering out?
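
To make the idea concrete, here is a rough sketch of what such a filter might look like. The name `stripHiddenData` and the `ElementLike` interface are made up for illustration and are not part of freeze-dry's API; the interface only lists the handful of DOM members the sketch relies on, so that it can also be tried outside a browser.

```typescript
// Minimal structural view of a DOM element (an assumption for this sketch).
// A real DOM `Element` should satisfy this shape.
interface ElementLike {
  tagName: string;
  children: ArrayLike<ElementLike>;
  getAttributeNames(): string[];
  getAttribute(name: string): string | null;
  removeAttribute(name: string): void;
  remove(): void;
}

// Hypothetical helper, NOT an existing freeze-dry function.
// Removes <input type="hidden"> descendants of `root` and every data-*
// attribute on `root` and its descendants. Returns the number of elements
// and attributes that were removed.
function stripHiddenData(root: ElementLike): number {
  let removed = 0;

  for (const name of root.getAttributeNames()) {
    if (name.toLowerCase().startsWith('data-')) {
      root.removeAttribute(name);
      removed++;
    }
  }

  // Copy the child list first: removing nodes would mutate a live collection.
  for (const child of Array.from(root.children)) {
    const isHiddenInput =
      child.tagName.toLowerCase() === 'input' &&
      (child.getAttribute('type') ?? '').trim().toLowerCase() === 'hidden';
    if (isHiddenInput) {
      child.remove();
      removed++;
    } else {
      removed += stripHiddenData(child);
    }
  }
  return removed;
}
```

This would presumably be run on the cloned document rather than the live page, so the user's own view is left untouched.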

Of course, such a filtering approach will never guarantee a clean result, but it could probably weed out most cases. Interestingly, PageShot got a bit closer to a guarantee by taking the inverse approach: rather than cloning the whole DOM and filtering things out, it tried to pick only the elements and attribute types that it knew about.
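
For comparison, an allowlist variant could look roughly like the sketch below. The tag and attribute lists are purely illustrative (they are not the lists PageShot used), and `keepOnlyKnown` and `ElementLike` are hypothetical names. Unlike PageShot, which as I understand it assembled its output from known pieces, this sketch applies the same allowlist idea in place to an already cloned tree.

```typescript
// Minimal structural view of a DOM element (an assumption for this sketch).
interface ElementLike {
  tagName: string;
  children: ArrayLike<ElementLike>;
  getAttributeNames(): string[];
  getAttribute(name: string): string | null;
  removeAttribute(name: string): void;
  remove(): void;
}

// Illustrative allowlists only; a real list would need far more care.
const ALLOWED_TAGS = new Set([
  'html', 'head', 'title', 'body', 'div', 'span', 'p', 'a', 'img',
  'ul', 'ol', 'li', 'h1', 'h2', 'h3', 'table', 'tr', 'td', 'th',
]);
const ALLOWED_ATTRS = new Set(['href', 'src', 'alt', 'title', 'class']);

// Hypothetical helper: drops every attribute and every child element
// (including its subtree) that is not explicitly allowed.
// Returns the number of attributes and elements that were dropped.
function keepOnlyKnown(root: ElementLike): number {
  let dropped = 0;

  for (const name of root.getAttributeNames()) {
    if (!ALLOWED_ATTRS.has(name.toLowerCase())) {
      root.removeAttribute(name);
      dropped++;
    }
  }

  // Copy the child list first: removing nodes would mutate a live collection.
  for (const child of Array.from(root.children)) {
    if (!ALLOWED_TAGS.has(child.tagName.toLowerCase())) {
      child.remove();
      dropped++;
    } else {
      dropped += keepOnlyKnown(child);
    }
  }
  return dropped;
}
```

The trade-off is the expected one: anything the allowlist does not know about is lost, including harmless content, but unknown carriers of hidden data are dropped by default instead of having to be anticipated one by one.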

Of course, in many use cases one may also want to remove everything that is invisible simply to reduce the size of the output. Ideally, the various DOM transformations like these would not be implemented in freeze-dry itself but could be plugged in. But I’ll park the issue here for the time being.
