Sync client eventloop should not crash the process #6958
Errors should be reported to SyncClientConfig::default_socket_provider_thread_observer.
Specifically catching MultipleSyncAgents and reporting it as a session error should be pretty easy to do and makes sense. The sync client mostly doesn't use exceptions to report recoverable errors, but that's an exception.
➤ Jonathan Reams commented: Seems like this ticket is really about making it easier to consume MultipleSyncAgents exceptions. We could make the sync client pass an error to the sync error handler with a MultipleSyncAgents Status instead of throwing a MultipleSyncAgents exception, if that would make things easier. Otherwise, I don't think we can ever guarantee that the event loop will never crash the process. The current behavior is not broken; it's just not what you would prefer.
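For illustration, a minimal sketch of what that proposal could look like from an SDK's side, assuming a `SyncError` that carries a `Status` (as in recent core versions). The header paths and field names are from memory and may need adjusting, and `notify_user` is a hypothetical SDK hook, not a real API:

```cpp
#include <memory>
#include <string>

#include <realm/error_codes.hpp>
#include <realm/object-store/sync/sync_session.hpp>

using namespace realm;

// Hypothetical SDK-side hook for surfacing the error to the application.
void notify_user(const std::string& realm_path, const std::string& reason);

void configure(SyncConfig& sync_config)
{
    sync_config.error_handler = [](std::shared_ptr<SyncSession> session, SyncError error) {
        if (error.status.code() == ErrorCodes::MultipleSyncAgents) {
            // Another process already has an active sync agent for this Realm
            // file; report it instead of letting the event loop kill the process.
            notify_user(session->path(), error.status.reason());
        }
    };
}
```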
I don't imagine we can ever guarantee 100% crash-free operation, though I'd argue that when a crash is unavoidable, we should provide meaningful information to users. On a lot of the platforms we support, a sync client crash results in a message like
➤ Jonathan Reams commented: Beyond the current logging and setting the thread observer so you get access to uncaught exceptions?
I'm not sure if additional logging has been added since the user reported the issue in September, but back then Core/Sync weren't logging the crash reason by default (per the user's screenshot). Is there something the SDK should set to enable logging of crash reasons? Regarding the thread observer, I'm assuming you're referring to making sure handle_error is implemented in a way that the error is propagated to consumers before we crash the process? If that is indeed the case, then I think that should be sufficient, and I can create tasks for the SDKs to do this. Is there a mechanism we can hook into to test this? It seems quite important that it's implemented correctly. I tried grepping for it, but couldn't find tests for the class I could use as inspiration; it's possible I'm just not looking for the right thing.
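For reference, a minimal sketch of wiring up such an observer, modeled on core's `BindingCallbackThreadObserver`. The exact virtual signature of `handle_error`, the meaning of its return value, and the header paths should all be checked against the actual headers; `report_fatal_sync_error` is a hypothetical SDK hook:

```cpp
#include <exception>
#include <memory>

#include <realm/object-store/binding_callback_thread_observer.hpp>
#include <realm/object-store/sync/sync_manager.hpp>

// Hypothetical SDK-side hook for fatal-error reporting.
void report_fatal_sync_error(const char* message);

struct SdkThreadObserver : realm::BindingCallbackThreadObserver {
    bool handle_error(const std::exception& e) override
    {
        // Forward the uncaught event-loop exception to the SDK before the
        // process goes down, so users see more than a bare crash.
        report_fatal_sync_error(e.what());
        return false; // assumption: false preserves the default terminate behavior
    }
};

void configure(realm::SyncClientConfig& client_config)
{
    client_config.default_socket_provider_thread_observer = std::make_shared<SdkThreadObserver>();
}
```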
➤ Jonathan Reams commented: Currently the only implementation outside the Java SDK is in the C API - you can look at the tests for that implementation here. Just an FYI: when you turn this on, you'll lose the stack trace of any thrown exceptions that would previously have been printed to stderr; you'll just get the exception error code/message. Also, if/when you implement platform networking you won't have to do this, because you'll control the event loop and can catch any exceptions you want yourself. Finally, do you also want the MultipleSyncAgents exception to be passed to the sync session error handler rather than being thrown? It's probably the one exception that you currently have to catch as an exception from the sync client.
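As an illustration of today's situation, a hedged sketch of catching `MultipleSyncAgents` as an exception. Where exactly core throws it (at open time vs. during session activation) isn't confirmed here, so treat the catch site as illustrative; `handle_multiple_sync_agents` is a hypothetical handler:

```cpp
#include <realm/exceptions.hpp>
#include <realm/object-store/shared_realm.hpp>

// Hypothetical SDK-side handler.
void handle_multiple_sync_agents();

void open_realm(const realm::Realm::Config& config)
{
    try {
        auto realm = realm::Realm::get_shared_realm(config);
        // ... use the Realm ...
    }
    catch (const realm::Exception& e) {
        if (e.code() == realm::ErrorCodes::MultipleSyncAgents) {
            // Another process already owns the sync agent for this Realm file.
            handle_multiple_sync_agents();
        }
        else {
            throw;
        }
    }
}
```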
I think for
Currently, if an exception occurs in the sync client event loop, the process is terminated with very little, if any, information about what went wrong. Instead, the event loop should be structured so that the operations being executed are tied to a Realm/session and any exceptions are logged and reported as fatal errors through the session error handler, after which sync stops for that Realm/session but remains active for any other sessions.
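A pseudocode-level sketch of that proposed structure (this is not core's actual event loop; `Session`, `WorkItem`, and `report_fatal_error` are hypothetical stand-ins):

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <memory>
#include <string>

// Hypothetical stand-in for the real session type; the real sync client
// would use its SyncSession here.
struct Session {
    std::string path() const { return path_; }
    void report_fatal_error(const std::exception&) { /* fire the session error handler */ }
    void close() { /* stop sync for this Realm only */ }
    std::string path_;
};

// A queued event-loop operation, tagged with the session it belongs to.
struct WorkItem {
    std::weak_ptr<Session> session;
    std::function<void()> invoke;
};

void run_one(WorkItem& item)
{
    try {
        item.invoke();
    }
    catch (const std::exception& e) {
        if (auto session = item.session.lock()) {
            // Log, report as a fatal error through the session's error
            // handler, and stop sync for this session; other sessions on
            // the same event loop keep running.
            std::cerr << "Sync failed for '" << session->path() << "': " << e.what() << '\n';
            session->report_fatal_error(e);
            session->close();
        }
        else {
            throw; // no owning session: preserve today's crash behavior
        }
    }
}
```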
A very common example of the current broken behavior is when multiple processes try to open the same Realm file and get a `MultipleSyncAgents` error. When this happens on Windows, the process crashes with no information about the cause of the problem or how to fix it. This is the second issue that manifested in this bug report: realm/realm-dotnet#3437.