Shared cache is racy #36

Open
RReverser opened this issue Jan 3, 2024 · 2 comments

@RReverser

The shared cache implementation currently takes a lock, checks whether the item is absent, releases the lock, takes the lock again, and only then inserts the item.

The problem is that another call to the same function can come in between those two operations and take the lock, leading to a race condition.

It should instead hold the lock for the duration of the entire function.
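
For concreteness, here's a minimal sketch of the racy pattern, assuming a memoized `fn(u64) -> u64` and a `static` `Mutex`-wrapped `HashMap` store (illustrative names, not the crate's actual generated code):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Illustrative stand-in for the generated cache (not the crate's real code).
static CACHE: Mutex<Option<HashMap<u64, u64>>> = Mutex::new(None);

fn memoized(n: u64) -> u64 {
    // Critical section 1: check for a cached value, then drop the lock.
    let hit = CACHE
        .lock()
        .unwrap()
        .as_ref()
        .and_then(|map| map.get(&n).copied());
    if let Some(v) = hit {
        return v;
    }
    // <-- race window: a second thread can also miss the check above
    //     before this thread reaches the insert below.
    let result = underlying(n);
    // Critical section 2: take the lock again and insert.
    CACHE
        .lock()
        .unwrap()
        .get_or_insert_with(HashMap::new)
        .insert(n, result);
    result
}

fn underlying(n: u64) -> u64 {
    n * 2 // placeholder for the real computation
}
```

Two threads can both miss in the first critical section before either reaches the insert, so both end up calling `underlying(n)`.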

@dermesser
Owner

The reason it's done like that is #3: the lock cannot be held while calling the original function, because that function might itself call back into the memoizer.

So the choice is: either be racy, or work on recursive functions. To me, the latter is much more useful and typical for applications. If I understand correctly, the worst case of a race here is that the underlying function will be called twice instead of once, and the (identical) result will be inserted twice. That is not a terrible outcome, and not an error either, because all hashmap accesses are still synchronized.
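
To make the trade-off concrete, here's a sketch (same illustrative names as above) of what holding the lock across the call does to a recursive memoized function:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

static CACHE: Mutex<Option<HashMap<u64, u64>>> = Mutex::new(None);

// BROKEN on purpose: holds the lock for the entire function, as the
// issue suggests. Correct under contention, but any recursion deadlocks.
fn memoized_fib(n: u64) -> u64 {
    let mut guard = CACHE.lock().unwrap(); // held until the end of the function
    let map = guard.get_or_insert_with(HashMap::new);
    if let Some(&hit) = map.get(&n) {
        return hit;
    }
    // The recursive calls re-enter memoized_fib, which blocks trying to
    // take CACHE while this frame still holds it: std's Mutex is not
    // reentrant, so any n >= 2 deadlocks right here.
    let result = if n < 2 { n } else { memoized_fib(n - 1) + memoized_fib(n - 2) };
    map.insert(n, result);
    result
}
```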

What do you think?

@RReverser
Author

> If I understand correctly, the worst case of a race here is that the underlying function will be called twice instead of once

Yes, and that is exactly the problem: I'm using memoization specifically to ensure that a given function is called only once for given arguments (I'm working with a database, so unique key -> single call really matters in this scenario).

For now I'm using custom memoization in a bunch of places, with a manual static + Mutex + Option + HashMap in each function. I wanted to switch to this crate for nicer DX, but then ran into this issue, where the race caused duplicate calls.

IMO, guaranteeing that a function is called only once for given arguments is usually a more important property of memoization than support for recursive calls, but I believe you can have both properties if you wrap not only the HashMap itself but also its values in something like Arc<OnceLock<...>>.

Then you'd only need to hold the global HashMap mutex briefly: just long enough to insert an uninitialized OnceLock "cell". After that you can initialize the cell outside the HashMap lock, and initialization is guaranteed to happen only once. This allows other calls to the same function to proceed recursively without deadlocking.
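
Here is a rough sketch of that scheme; the names and the `fn(u64) -> u64` shape are illustrative, not a patch against this crate:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

// Each value is a cell that can be initialized exactly once.
static CELLS: Mutex<Option<HashMap<u64, Arc<OnceLock<u64>>>>> = Mutex::new(None);

fn memoized(n: u64) -> u64 {
    // Short critical section: fetch or insert the (possibly empty) cell.
    let cell = {
        let mut guard = CELLS.lock().unwrap();
        let map = guard.get_or_insert_with(HashMap::new);
        Arc::clone(map.entry(n).or_default())
    };
    // Outside the map lock: initialize the cell at most once. Concurrent
    // callers with the same key block here instead of recomputing, and
    // recursive calls with *different* keys can re-enter memoized freely
    // because the map mutex is no longer held.
    *cell.get_or_init(|| underlying(n))
}

fn underlying(n: u64) -> u64 {
    n * 2 // placeholder for the real computation
}
```

(One caveat: a recursive call with the *same* key would re-entrantly initialize the same OnceLock, which std doesn't support, but that case is unbounded recursion anyway.)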
