Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Posthog: Stop double reporting UTD events when app is relaunched. #27421

Closed
uhoreg opened this issue May 1, 2024 · 3 comments · Fixed by matrix-org/matrix-react-sdk#12501
Closed
Assignees
Labels
A-E2EE A-Telemetry Telemetry / analytics to understand usage S-Minor Impairs non-critical functionality or suitable workarounds exist T-Defect Team: Crypto

Comments

@uhoreg
Copy link
Member

uhoreg commented May 1, 2024

This is the issue for the Element Web portion of element-hq/element-meta#2333

Summary: Element Web does not persist what events have been reported as UTDs, so when it restarts, UTDs get re-reported if it tries to decrypt them again, leading to over-reporting of UTDs.

@uhoreg uhoreg added Team: Crypto A-E2EE A-Telemetry Telemetry / analytics to understand usage labels May 1, 2024
@uhoreg uhoreg self-assigned this May 1, 2024
@dosubot dosubot bot added S-Minor Impairs non-critical functionality or suitable workarounds exist T-Defect labels May 1, 2024
@uhoreg
Copy link
Member Author

uhoreg commented May 2, 2024

Looking at the existing (commented out) code to do this, it tries to store all the tracked event IDs as an array in localStorage. I tested Firefox and Chrome for the storage limit of a single localStorage entry, and they both can store up to about 5200000 characters. Event IDs in v3+ rooms are 44 characters, plus two quotation marks and a comma if we store it as a JSON array, so 47 characters per event ID. That means that we would be able to store 110.6k event IDs. Which is probably enough.

But we should probably have some sort of handling for when we do run out of space. The most reasonable thing to do would be to evict the oldest events (whether by origin_server_ts, or by when we first tried to decrypt the event), with the rationale that if an event is really old, it's less likely that we'd be trying (and failing) to decrypt the event, since the user probably won't be scrolling back that far. But if we're storing a Set as a JSON array, as the current code does, we don't have a timestamp. Storing the events as a map from event ID to timestamp would add 14 characters if we use millisecond-precision, or 11 characters if we use second-precision (including the separator between the event ID and the timestamp). Second-precision is probably good enough, so that would give us 58 characters per event, and we'd be able to store 89.6k events. Which is still probably enough.

The alternatives would be to:

  • have one entry per event in localStorage, rather than storing them all in one entry. But it's probably good to have some sort of cap on how much storage we use.
  • store the events in indexedDB. But that would make things async, while the current code is all synchronous.

@richvdh richvdh changed the title Posthog: Stop double reporting UTDs events when app is relaunched. Posthog: Stop double reporting UTD events when app is relaunched. May 2, 2024
@t3chguy
Copy link
Member

t3chguy commented May 2, 2024

@uhoreg keep in mind the space is likely not per-key but for the whole Storage object, and there's no way to detect when you're approaching the limit, only when you cross it. The quota estimation API is often wildly wrong.

@uhoreg
Copy link
Member Author

uhoreg commented May 2, 2024

Yeah, one thing we could do is to impose our own limit on how many keys are stored. It would be bad if we used up all of localStorage for this, and then something more important tried to use localStorage and failed.

Another alternative that was suggested was to use a probabilistic data structure such as Bloom filters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-E2EE A-Telemetry Telemetry / analytics to understand usage S-Minor Impairs non-critical functionality or suitable workarounds exist T-Defect Team: Crypto
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants