Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix likely deadlock issue in SafeClient #173

Closed
wants to merge 2 commits into from
Closed

Conversation

Hyodar
Copy link
Contributor

@Hyodar Hyodar commented May 22, 2024

From our testing data on operator state root messaging, it seems quite likely there's a deadlock happening in SafeClient for block submissions. Would be good to get an overall review on it. Trying to change some of the approaches that seem problematic while I get more info on it, at the moment:

  • Centralized resubbing in reinits, so there's no way a resub can happen while it's reinitializing
  • Made resubs not block the event listening from the subscription task

Good to highlight the rollup clients are only used for header subscriptions, so it's down to SubscribeNewHead specifically.

@Hyodar Hyodar requested review from emlautarom1 and taco-paco May 22, 2024 12:05
@Hyodar Hyodar self-assigned this May 22, 2024
@Hyodar
Copy link
Contributor Author

Hyodar commented May 22, 2024

One additional point here is - is it necessary to reinitialize the client? It makes sense to do it, but maybe it's not necessary. I'm starting to feel like it's not, and we could just simplify this module a lot.

@emlautarom1
Copy link
Contributor

How was this deadlock detected? Any way to reproduce in a testing environment?

@Hyodar
Copy link
Contributor Author

Hyodar commented May 22, 2024

As mentioned there, not totally sure what's happening yet. Couldn't reproduce it as well, currently trying to subscribe to headers locally and see if I can catch it. The symptoms are simply that randomly after some time an operator stops sending StateRootUpdateMessages, also trying to get more info on it from operator logs.

@Hyodar
Copy link
Contributor Author

Hyodar commented May 22, 2024

Setting this as draft for now (actually should've set as draft when creating - not meant to be open rn), as I couldn't yet reproduce the issue and I didn't see it happening again. Also, it's quite likely I'm going to simplify the module a bit - we don't actually need to reinitialize the client.

This could also not actually be an issue. My previous understanding was that the way reinit notifications are working could cause a deadlock when resubbing, but this doesn't make too much sense.

@Hyodar Hyodar marked this pull request as draft May 22, 2024 21:09
@Hyodar
Copy link
Contributor Author

Hyodar commented May 27, 2024

Closing as this is not relevant anymore as per latest changes.

@Hyodar Hyodar closed this May 27, 2024
@Hyodar Hyodar deleted the fix/client-deadlock branch October 3, 2024 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants