LuaLock failing to release, part 2 #238

Open
coandco opened this issue May 15, 2024 · 0 comments
Labels
bug Something isn't working


After dealing with the new LockReleaseError in #222 (which accounted for most of the problems I was having), I discovered that a small percentage of the time, exiting an async with LuaLock block still never returns.

Expected Behaviour

import asyncio
from coredis import RedisCluster
from coredis.recipes.locks import LuaLock
from coredis.exceptions import LockError

class TestLock(LuaLock):
    async def __aexit__(self, exc_type, exc, tb):
        print("Before aexit")
        await super().__aexit__(exc_type, exc, tb)
        print("After aexit")

async def main():
    rclient = RedisCluster(startup_nodes=[{"host": "example", "port": 6372}])
    try:
        async with TestLock(rclient, "examplename", blocking_timeout=0.1, timeout=10):
            print("entered lock block")
            # Hold the lock briefly; the sleep must be awaited to take effect.
            await asyncio.sleep(1)
        print("exited lock block")
    except Exception as e:
        print(f"Hit exception {e!r} when using lock")
        return
    print("after lock")

if __name__ == "__main__":
    asyncio.run(main())

You should always see "Before aexit" and "After aexit" if you saw "entered lock block", or you should hit the "Hit exception" print if there was an error.

Current Behaviour

On the same fairly large distributed system as in #222, I'm still very occasionally (about once a day) hitting a situation where I see "Before aexit" but neither "After aexit" nor "Hit exception".

Steps to Reproduce

This is a production bug that I'm not sure how to reproduce reliably, other than having a lot of contention for locks. It does seem like it only happens when the lock is held for a very short period of time.

Workaround

For now, I'm considering the following as a workaround:

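import asyncio
from coredis.recipes.locks import LuaLock

class Lock(LuaLock):
    def __init__(self, *args, release_timeout: float = 5, **kwargs):
        # Maximum number of seconds to wait for the release to complete.
        self.release_timeout = release_timeout
        super().__init__(*args, **kwargs)

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Raises asyncio.TimeoutError if the release does not finish in time.
        await asyncio.wait_for(super().__aexit__(exc_type, exc_val, exc_tb), timeout=self.release_timeout)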

That way I can at least ensure that exiting the lock will always return within a set timeout value (and allow the lock to expire on the Redis side on its own), rather than sometimes just going out to lunch and never coming back.
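To illustrate what the workaround does when a release hangs, here is a minimal sketch that does not need Redis or coredis. HangingLock is a hypothetical stand-in (not part of coredis) whose __aexit__ never returns, and BoundedLock applies the same asyncio.wait_for pattern to it. It assumes nothing about why the real release hangs; it only shows that the timeout surfaces as asyncio.TimeoutError at the end of the async with block.

```python
import asyncio


class HangingLock:
    """Hypothetical stand-in for a lock whose release never returns."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Simulate the hang by waiting on an event that is never set.
        await asyncio.Event().wait()


class BoundedLock(HangingLock):
    """Same pattern as the workaround: bound the release with a timeout."""

    def __init__(self, release_timeout: float = 0.05):
        self.release_timeout = release_timeout

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # wait_for cancels the inner release and raises asyncio.TimeoutError
        # once release_timeout seconds have passed.
        await asyncio.wait_for(
            super().__aexit__(exc_type, exc_val, exc_tb),
            timeout=self.release_timeout,
        )


async def demo() -> str:
    try:
        async with BoundedLock():
            pass
    except asyncio.TimeoutError:
        return "release timed out"
    return "released"


result = asyncio.run(demo())
print(result)
```

One consequence worth noting: callers now need to be prepared for asyncio.TimeoutError when leaving the block, and if the body of the block raised its own exception, the timeout raised from __aexit__ is the one that propagates.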

Your Environment

  • coredis version: 4.17.0
  • Redis version: 6.0.16
  • Operating system: Debian 11
  • Python version: 3.9.2