.lock file should not contain information about sync agents #6957

Open
nirinchev opened this issue Sep 7, 2023 · 4 comments
@nirinchev
Member

nirinchev commented Sep 7, 2023

If a sync process crashes or is otherwise terminated, the SharedInfo stored in the lockfile will keep the sync_agent_present flag raised, meaning other processes won't be able to open the Realm file until the lockfile is deleted. Instead, we should devise a mechanism that ensures that when a process is terminated, there are no leftover flags and new processes can start using the file again.

@fealebenpae suggested storing an empty file in the management directory which is owned by the sync agent process and is released when the process is terminated, allowing for other processes to take over ownership, but other approaches are similarly valid.

This would solve the underlying issue causing the crashes reported in this ticket: realm/realm-dotnet#3437.
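
The marker-file idea above could be sketched roughly as follows. This is not how Realm implements anything: the `SyncAgentMarker` class and the `sync_agent.marker` file name are hypothetical, and the sketch assumes a POSIX system where `fcntl.flock` is available (Windows would need a different locking primitive). The point it illustrates is that the operating system drops the lock when the owning process goes away, so no stale flag has to be cleaned up by hand.

```python
import fcntl
import os


class SyncAgentMarker:
    """Hypothetical helper: claims the sync-agent role via an OS file lock."""

    def __init__(self, management_dir):
        # "sync_agent.marker" is an illustrative name, not a file Realm uses.
        self.path = os.path.join(management_dir, "sync_agent.marker")
        self._fd = None

    def try_acquire(self):
        """Return True if this object now owns the sync-agent role."""
        if self._fd is not None:
            return True
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Non-blocking exclusive lock. The kernel releases it when the
            # owning process exits or crashes, so nothing is left behind.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone else already holds the role.
            os.close(fd)
            return False
        self._fd = fd
        return True

    def release(self):
        """Give up the role; closing the descriptor releases the lock."""
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None
```

A process would call `try_acquire()` when it wants to become the sync agent and treat a `False` result as "another live agent exists". Because ownership lives in the kernel rather than in the file contents, the marker file itself can stay empty.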

@tgoyne
Member

tgoyne commented Sep 7, 2023

The lockfile only stays alive as long as the Realm file is open in at least one process, so this is only a problem in the very specific scenario of opening the Realm in a non-agent process and holding the Realm open while the agent process crashes and then restarts. Multiprocess sync will make this problem go away entirely, so it doesn't seem worth trying to fix this edge case.

@nirinchev
Member Author

I agree - it's an edge case, but we've seen it reported multiple times, especially on Windows/Unity scenarios where the developer keeps the Realm open in Studio and restarts their application multiple times during the development process.

If multiprocess sync is coming in the near future, that will definitely be the preferable solution, and we can close this in favor of just not throwing. We'd still need to make sure that whatever mechanism we devise for coordination between the sync agents accounts for process crashes, so that we don't end up in a similar situation where a terminated sync agent is still considered the primary one.
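
One way to check that a coordination mechanism survives a crashed primary agent is to kill the primary abruptly and confirm a successor can take over. The sketch below does this with an OS file lock. It is an illustration only: `try_lock`, `takeover_after_crash`, and the `primary_agent.marker` file name are hypothetical, and it assumes a POSIX system with `os.fork` and `fcntl.flock`.

```python
import fcntl
import os
import signal
import tempfile
import time


def try_lock(path):
    """Open path and take a non-blocking exclusive lock; return fd or None."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return fd
    except BlockingIOError:
        os.close(fd)
        return None


def takeover_after_crash():
    """Simulate a primary agent being killed, then a successor taking over."""
    # Illustrative marker file in a throwaway directory.
    path = os.path.join(tempfile.mkdtemp(), "primary_agent.marker")
    ready_r, ready_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child plays the primary agent: take the lock, report, then hang.
        try:
            os.close(ready_r)
            fd = try_lock(path)
            os.write(ready_w, b"1" if fd is not None else b"0")
            while True:
                time.sleep(60)
        finally:
            os._exit(1)

    os.close(ready_w)
    assert os.read(ready_r, 1) == b"1"  # the child holds the lock
    os.close(ready_r)

    # While the primary is alive, nobody else can claim the role.
    blocked_while_alive = try_lock(path) is None

    os.kill(pid, signal.SIGKILL)  # abrupt termination, no cleanup code runs
    os.waitpid(pid, 0)

    # The kernel released the dead process's lock, so takeover succeeds.
    fd = try_lock(path)
    acquired_after_crash = fd is not None
    if fd is not None:
        os.close(fd)
    return blocked_while_alive, acquired_after_crash
```

Calling `takeover_after_crash()` returns two booleans: whether the role was refused while the primary was alive, and whether it could be claimed after the primary was killed. A flag stored inside a shared file would fail the second check, which is the situation this issue describes.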

@tgoyne
Member

tgoyne commented Sep 8, 2023

Dealing with the sync agent being suspended or terminated is the primary hard part of multiprocess sync and is why I'm expecting it to take a few months.

We currently haven't found a robust way to handle cleanup when one process in a session crashes, even outside of sync. If a process crashes while holding the write lock, it'll usually only result in the write lock being released. Any versions used by the crashed process will be leaked until the session ends.

sync-by-unito bot commented Nov 27, 2023

➤ Jonathan Reams commented:

Not sure where this ticket stands right now. [~[email protected]], will this be done as part of your multi-process sync work? Is there other work outside that project that needs to be done here?

3 participants