One of the use cases of freeze-dry is to snapshot web pages in order to share them with others. If a page is personalised, e.g. a user snapshots their shopping cart of a web shop, the page may contain private information one would rather not share. If that information is visible, the user can notice it and choose not to share (or could edit the page with other tools). But if the information is hidden in the page, for example when a session ID or anti-CSRF token is stored in a hidden input field, they might accidentally share private information they could not see themselves.
I once heard that this risk of accidentally sharing hidden, sensitive information was one of the reasons Mozilla’s PageShot experiment ultimately stopped capturing the DOM and only output a screenshot (despite its excellent work on capturing the DOM, similar to freeze-dry).
Freeze-dry already removes JavaScript, which eliminates one potential source of hidden information. We could also consider adding an option to remove <input type="hidden"> elements. And perhaps data-… attributes? Are there other invisible elements/attributes that are often used for sensitive data, and that we should thus consider filtering out?
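To make the filtering idea concrete, here is a minimal standalone sketch in Python. It is not freeze-dry's code or API (freeze-dry works on a live DOM in JavaScript); the names `HiddenDataStripper` and `strip_hidden_data` are made up for illustration. It re-serialises an HTML string while dropping hidden inputs and `data-` attributes, and nothing else.

```python
from html import escape
from html.parser import HTMLParser


class HiddenDataStripper(HTMLParser):
    """Hypothetical filter: re-emit HTML minus hidden inputs and data-* attributes."""

    def __init__(self):
        # convert_charrefs=False so entities in text are passed through as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_start(self, tag, attrs, closing=">"):
        attr_map = dict(attrs)
        # Drop <input type="hidden"> entirely (it is a void element, so no end tag).
        if tag == "input" and (attr_map.get("type") or "").lower() == "hidden":
            return
        parts = [tag]
        for name, value in attrs:
            if name.startswith("data-"):
                continue  # drop data-* attributes
            # The parser unescapes attribute values, so escape them again.
            parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, closing=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_hidden_data(html: str) -> str:
    parser = HiddenDataStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

For example, `strip_hidden_data('<form data-user="42"><input type="hidden" name="csrf" value="s3cr3t"><input type="text" name="q"></form>')` keeps the form and the text input but drops the token and the `data-user` attribute. Everything else, including comments, passes through untouched, which is exactly why a denylist like this cannot guarantee cleanness.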
Of course such a filtering approach will never guarantee cleanness, but it could probably weed out most of the cases. Interestingly, PageShot got a bit closer to a guarantee by taking the inverse approach: not cloning the whole DOM and filtering things out, but trying to only pick the elements and attribute types that it knows about.
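The inverse, allowlist-style approach could look roughly like the following Python sketch. It is an illustration only, not PageShot's or freeze-dry's actual code: the `ALLOWED` table and the names `AllowlistSerializer` and `allowlist_html` are hypothetical. It also assumes well-formed input in which every non-void element has an explicit closing tag, as a serialised DOM would. Elements that are not on the list are dropped together with their subtree, and comments are dropped as well.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: tag name -> attribute names that may be kept.
ALLOWED = {
    "div": {"class"},
    "p": {"class"},
    "span": {"class"},
    "a": {"href", "class"},
    "img": {"src", "alt"},
}
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class AllowlistSerializer(HTMLParser):
    """Emit only known elements and attributes; drop everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or tag not in ALLOWED:
            # Void elements have no end tag, so they never open a skipped subtree.
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        parts = [tag]
        for name, value in attrs:
            if name in ALLOWED[tag]:
                parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        elif tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives unescaped (convert_charrefs=True), so escape it again.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def allowlist_html(html: str) -> str:
    parser = AllowlistSerializer()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

With this table, hidden inputs, `data-` attributes, inline event handlers and unknown custom elements all disappear without being named anywhere. The cost is that anything missing from the list is lost too, including visible content.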
Of course, in many use cases one may also want to remove everything that is invisible simply for reducing the size of the output. Ideally, various types of DOM transformations like these would not be implemented in freeze-dry itself, but could be plugged in. But I’ll park the issue here for the time being.