Add a column participant to room_memberships table #18068

Open
wants to merge 14 commits into
base: develop
from

Conversation

H-Shay
Contributor

@H-Shay H-Shay commented Jan 7, 2025

Adds a column participant to room_memberships table to track whether a user has participated in a room - participation is defined as having sent a m.room.message or m.room.encrypted event into the room.

Related to #18040: the approach there to determine room participation was deemed too inefficient, and adding this column was the recommended remedy.

@H-Shay H-Shay requested a review from a team as a code owner January 7, 2025 20:04
@H-Shay H-Shay changed the title from "Add a column participant to room_membership table" to "Add a column participant to room_memberships table" Jan 7, 2025
@H-Shay
Contributor Author

H-Shay commented Jan 7, 2025

Requesting @anoadragon453's feedback on this as he has context.

@H-Shay H-Shay marked this pull request as draft January 7, 2025 20:20
@erikjohnston erikjohnston requested a review from a team January 8, 2025 11:08
@H-Shay
Contributor Author

H-Shay commented Jan 13, 2025

Delayed events complement test failure looks like a flake?

@H-Shay H-Shay requested a review from anoadragon453 January 13, 2025 04:58
@H-Shay
Contributor Author

H-Shay commented Jan 21, 2025

@anoadragon453 just wondering if I could get your input on this to ensure it aligns with what we discussed in #18040?

Member

@anoadragon453 anoadragon453 left a comment

@H-Shay apologies for the slow turn-around on this one. I've been swamped with other work :(

I've made some large suggestions below, I'm afraid, but overall this is heading in a good direction!

txn: LoggingTransaction, last_room_id: str
) -> Optional[str]:
sql = """
SELECT room_id from room_memberships WHERE room_id > ?
Member

Note that room_memberships is a table that is always being appended to, and thus always changing under you. It is ordered by its event_stream_ordering column. So, the only way to traverse it while the system is running, without leaving gaps, is to iterate using the event_stream_ordering column.

Note: room_memberships only has rows deleted from it when a room is purged.
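
As a rough illustration of iterating by event_stream_ordering, here is a minimal sketch using Python's built-in sqlite3 module rather than Synapse's storage layer. The function name iter_membership_batches and the cut-down three-column table are hypothetical, and the sketch assumes each row has a distinct event_stream_ordering and that new rows always receive a higher value than existing ones.

```python
import sqlite3
from typing import Iterator, List, Optional, Tuple

Row = Tuple[int, str, str]


def iter_membership_batches(
    conn: sqlite3.Connection, batch_size: int
) -> Iterator[List[Row]]:
    """Yield room_memberships rows in event_stream_ordering order.

    Hypothetical helper: it remembers the last ordering it has seen and
    resumes strictly after it, so rows appended while iterating are picked
    up later instead of shifting the position of rows still to be read.
    """
    last_seen: Optional[int] = None
    while True:
        if last_seen is None:
            where, params = "", (batch_size,)
        else:
            where, params = "WHERE event_stream_ordering > ?", (last_seen, batch_size)
        rows = conn.execute(
            "SELECT event_stream_ordering, room_id, user_id "
            f"FROM room_memberships {where} "
            "ORDER BY event_stream_ordering LIMIT ?",
            params,
        ).fetchall()
        if not rows:
            return
        yield rows
        # Resume after the highest ordering returned in this batch.
        last_seen = rows[-1][0]
```

Because the cursor is a value in the column the table grows along, an insert that happens between two batches cannot cause a row to be skipped or read twice under the assumptions above.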

Member

Currently we have:

  • a query to pull out a room ID
  • a query that pulls out all users that have ever been joined to that room
  • a query per-user per-room that updates all previous entries that match that user/room combination

I think we can instead do this in one query that processes a batch of room_memberships rows all at once. Instead of saving the current room_id for the batch job, start with the row that currently has the maximum event_stream_ordering and work backwards in batches of, say, 1000.

Constrain your query to the rows between event_stream_ordering - BATCH_SIZE and the current event_stream_ordering. Within that range, UPDATE all rows based on data in the events table. Then save event_stream_ordering - BATCH_SIZE as the new position for your background job.

Now the table can continue to grow without things changing from underneath you, as historical data is only (rarely) deleted.
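
The batching described above could look roughly like the following. This is a sketch against an in-memory SQLite database, not Synapse's background update API: the names make_demo_db, backfill_participant_batch and run_backfill are hypothetical, the two tables are cut down to the columns the query needs, and the loop keeps its progress in a local variable where a real background job would persist it between runs.

```python
import sqlite3
from typing import Optional

BATCH_SIZE = 1000  # batch size suggested in the review; tune as needed


def make_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in schema (an assumption, not Synapse's real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE events (room_id TEXT, sender TEXT, type TEXT);
        CREATE TABLE room_memberships (
            room_id TEXT,
            user_id TEXT,
            event_stream_ordering INTEGER,
            participant INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    return conn


def backfill_participant_batch(
    conn: sqlite3.Connection, upper_bound: int, batch_size: int = BATCH_SIZE
) -> Optional[int]:
    """Update one window of rows, working backwards from upper_bound.

    Returns the new upper bound to save as progress, or None when no
    rows are left below the window that was just processed.
    """
    lower_bound = upper_bound - batch_size
    # One UPDATE for the whole window: mark rows whose user has sent a
    # message or encrypted event into that room.
    conn.execute(
        """
        UPDATE room_memberships
        SET participant = 1
        WHERE event_stream_ordering > ? AND event_stream_ordering <= ?
          AND EXISTS (
            SELECT 1 FROM events AS e
            WHERE e.room_id = room_memberships.room_id
              AND e.sender = room_memberships.user_id
              AND e.type IN ('m.room.message', 'm.room.encrypted')
          )
        """,
        (lower_bound, upper_bound),
    )
    conn.commit()
    remaining = conn.execute(
        "SELECT 1 FROM room_memberships WHERE event_stream_ordering <= ? LIMIT 1",
        (lower_bound,),
    ).fetchone()
    return lower_bound if remaining else None


def run_backfill(conn: sqlite3.Connection, batch_size: int = BATCH_SIZE) -> None:
    """Start at the current maximum ordering and walk down batch by batch."""
    upper_bound = conn.execute(
        "SELECT MAX(event_stream_ordering) FROM room_memberships"
    ).fetchone()[0]
    while upper_bound is not None:
        upper_bound = backfill_participant_batch(conn, upper_bound, batch_size)
```

Rows appended after the starting maximum was read fall outside every window, so this sketch assumes new memberships get their participant value set at write time by the regular event persistence path.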

synapse/storage/databases/main/roommember.py Outdated Show resolved Hide resolved
3 participants