
Feat: Pubsub system to update Casts #1

Merged
merged 40 commits into main from feat/event_broadcaster on Sep 8, 2024

Conversation

mindreframer
Member

No description provided.

Problem:
- It seems to be quite non-trivial to process events that are causally dependent.
- With sequential IDs, records with lower IDs can become visible later than records with higher IDs.
- With transaction IDs, lower values can likewise become visible later than higher ones.
- To work around these issues we would need to write some tricky code, maintain it, and write tests for it.


Solution:
- after some contemplation I decided to go for a stupid workaround
- we only allow sequential inserts into the EventStore (events) across all Elixir processes
- that way the IDs are strictly increasing and there is no interleaving
- that way the logic to deal with new events becomes trivial, since we should never see newer events with lower IDs

Technical realization:
- we use Postgres `pg_try_advisory_lock` and `pg_advisory_unlock`
- this allows coordination across all OS processes of our Elixir app that use the same DB at the same time
- the locks also respect the current Context scope, so different scopes do not conflict with each other
- to unlock reliably, the unlock must run on the same Postgres DB connection that acquired the lock
- a normal Ecto Repo does not guarantee this, since queries run on whatever connection the pool hands out
- as a workaround we configure SandRepo, whose sole purpose is acquiring and releasing PG locks
- appending events to our Event store is only possible through a locked function
Problem:
- the sandbox config has some test-specific params, like timing out, cleaning up the conn, etc.
- this prevents proper usage outside of tests

Solution:
- we just stick to a normal repo with a pool of size 1, so we always get the same connection (see the sketch below)
- this repo is only used for locking, so there should be no performance issues
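Roughly, the locking flow could look like this (a sketch under assumptions: `MyApp.LockRepo`, `MyApp.EventStore.Locking` and the integer `scope_key` are illustrative names, not the code in this PR):

```elixir
# Rough sketch only -- module and function names are illustrative.
defmodule MyApp.LockRepo do
  # A second repo used exclusively for advisory locks; configured with
  # `pool_size: 1`, so lock and unlock always run on the same connection:
  #
  #   config :my_app, MyApp.LockRepo, pool_size: 1
  use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.Postgres
end

defmodule MyApp.EventStore.Locking do
  alias MyApp.LockRepo

  # Runs `fun` while holding an advisory lock for the given scope key,
  # so event inserts stay strictly sequential per scope.
  def with_lock(scope_key, fun) when is_integer(scope_key) do
    %{rows: [[acquired]]} =
      Ecto.Adapters.SQL.query!(LockRepo, "SELECT pg_try_advisory_lock($1)", [scope_key])

    if acquired do
      try do
        {:ok, fun.()}
      after
        Ecto.Adapters.SQL.query!(LockRepo, "SELECT pg_advisory_unlock($1)", [scope_key])
      end
    else
      {:error, :locked}
    end
  end
end
```

Appending would then be wrapped as `with_lock(scope_key, fn -> append_to_stream(...) end)`.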
@mindreframer mindreframer force-pushed the feat/event_broadcaster branch from f94c52c to 214ffda on September 3, 2024 22:36
…napshot) to events

Problem:
- we might need to extend features for our event store
- to reduce chattiness (one notification per event), we need to store the current transaction id

Solution:
- we add the current PG transaction id + min current LSN to every document via triggers
- based on the example here: https://github.com/josevalim/sync/blob/main/priv/repo/migrations/20240806131210_create_publication.exs
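A sketch of what such a migration could look like, loosely following the linked example (column names, `set_event_tx_info` and the exact LSN expression are assumptions; the PR's real migration, including how it derives the "min current LSN", may differ; PostgreSQL 13+ is assumed):

```elixir
# Illustrative sketch, not the exact migration from this PR.
defmodule MyApp.Repo.Migrations.AddTxInfoToEvents do
  use Ecto.Migration

  def up do
    alter table(:events) do
      add :xact_id, :bigint
      add :lsn, :text
    end

    # Stamp every inserted event with the current transaction id and the
    # current WAL insert LSN.
    execute """
    CREATE OR REPLACE FUNCTION set_event_tx_info() RETURNS trigger AS $$
    BEGIN
      NEW.xact_id := pg_current_xact_id()::text::bigint;
      NEW.lsn := pg_current_wal_insert_lsn()::text;
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    """

    execute """
    CREATE TRIGGER events_set_tx_info
    BEFORE INSERT ON events
    FOR EACH ROW EXECUTE FUNCTION set_event_tx_info();
    """
  end

  def down do
    execute "DROP TRIGGER IF EXISTS events_set_tx_info ON events"
    execute "DROP FUNCTION IF EXISTS set_event_tx_info()"

    alter table(:events) do
      remove :xact_id
      remove :lsn
    end
  end
end
```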
Problem:
- we somehow need to make sure that our app notices newly inserted events
- a local PubSub would miss inserts from other app instances (IEx shells, etc.)
- also we do not want to overload our system by sending large messages (many events in a single message)

Solution:
- we add a signals table (somewhat inspired by Debezium - https://debezium.io/blog/2023/06/27/Debezium-signaling-and-notifications/)
- after each transaction on the events table we also insert a small row into the signals table with the current scope_uuid
- the signals table has triggers that execute `pg_notify` with a small message
- in our app we start PGNotifyListener, whose purpose is to re-broadcast those pg_notify messages to the local PubSub system (see the sketch below)
- that way our app still deals only with Phx PubSub; the messages just originate from the DB and cross OS process boundaries without a distributed Erlang cluster
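A rough sketch of such a listener (the module name, the `signals` channel and the PubSub topic are assumptions, not the PR's exact implementation):

```elixir
# Illustrative sketch of the notify listener.
defmodule MyApp.PGNotifyListener do
  use GenServer

  @channel "signals"

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(_opts) do
    # Reuse the main repo's connection settings for the LISTEN connection.
    {:ok, pid} = Postgrex.Notifications.start_link(MyApp.Repo.config())
    {:ok, ref} = Postgrex.Notifications.listen(pid, @channel)
    {:ok, %{pid: pid, ref: ref}}
  end

  @impl true
  def handle_info({:notification, _pid, _ref, @channel, payload}, state) do
    # Re-broadcast the (small) pg_notify payload to the local Phoenix PubSub,
    # so the rest of the app keeps using plain PubSub subscriptions.
    Phoenix.PubSub.broadcast(MyApp.PubSub, "events:signals", {:signal, payload})
    {:noreply, state}
  end
end
```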
…out work duplication

Problem:
- we need to fetch the same data from the DB at the same time from all the cast runner processes
- this could easily overload the DB due to the "thundering herd" effect

Solution:
- implement a caching layer for the event store (sketched below)
- this caching layer holds back all in-flight requests except the first one (which generates the value) and hands them the cached value once it has been generated
- the current module is a draft of the concept, with fake work
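Conceptually, the coalescing can be sketched as a GenServer that keeps one in-flight computation per key and replies to all waiters once the first caller's work finishes (hypothetical names, no eviction yet):

```elixir
# Conceptual "single flight" sketch -- not the draft module from this PR.
defmodule MyApp.CoalescingCache do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # All concurrent callers for the same key share one computation of `fun`.
  def fetch(key, fun), do: GenServer.call(__MODULE__, {:fetch, key, fun}, :infinity)

  @impl true
  def init(_opts), do: {:ok, %{values: %{}, waiting: %{}}}

  @impl true
  def handle_call({:fetch, key, fun}, from, state) do
    cond do
      Map.has_key?(state.values, key) ->
        {:reply, Map.fetch!(state.values, key), state}

      Map.has_key?(state.waiting, key) ->
        # A computation for this key is already running: queue the caller.
        {:noreply, update_in(state.waiting[key], &[from | &1])}

      true ->
        # The first caller triggers the work in a separate task.
        server = self()
        Task.start(fn -> send(server, {:computed, key, fun.()}) end)
        {:noreply, put_in(state.waiting[key], [from])}
    end
  end

  @impl true
  def handle_info({:computed, key, value}, state) do
    {waiters, waiting} = Map.pop(state.waiting, key, [])
    Enum.each(waiters, &GenServer.reply(&1, value))
    {:noreply, %{state | values: Map.put(state.values, key, value), waiting: waiting}}
  end
end
```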
…guments) tuple

Problem:
- we need a flexible way to tell the cache how to fill cache values
- hardcoding the logic feels inflexible

Solution:
- by using an MFA (module / function / arguments) tuple the cache becomes generic
- this can be used for any kind of caching and is extremely versatile
- it is now a generic caching layer, without other dependencies, useful for ANYTHING ;)
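The miss path then boils down to a single `apply/3` (hypothetical names; `EventStore.list_events/1` in the example is an assumption):

```elixir
# Hypothetical sketch: with an MFA tuple the cache needs no knowledge of what it caches.
defmodule MyApp.Cache.MFA do
  # On a cache miss, the value is produced by applying the stored MFA.
  def resolve({module, function, arguments}), do: apply(module, function, arguments)
end

# Example call site (scope_uuid and EventStore.list_events/1 are assumptions):
#   MyApp.Cache.fetch({:events, scope_uuid}, {MyApp.EventStore, :list_events, [scope_uuid]})
```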
Problem:
- we would like to prevent our cache from growing indefinitely
- for this we need to remember when a key was last used

Solution:
- store a last-used timestamp for each key (see the sketch below)
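A minimal sketch of the bookkeeping this enables (the 60-second idle limit and the module name are assumptions):

```elixir
# Hypothetical sketch: track a last-used timestamp per key and sweep stale entries.
defmodule MyApp.Cache.Expiry do
  @max_idle_ms 60_000

  # Records when a key was last used.
  def touch(timestamps, key), do: Map.put(timestamps, key, System.monotonic_time(:millisecond))

  # Returns the keys that have not been used within @max_idle_ms.
  def stale_keys(timestamps) do
    now = System.monotonic_time(:millisecond)

    for {key, last_used} <- timestamps, now - last_used > @max_idle_ms, do: key
  end
end
```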
Problem:
- after getting the "poke" from the signals table, we currently know the transaction id and the scope / stream uuids, but we still have to issue an extra query to figure out how many new events we need to fetch

Solution:
- the count of new events and the max_id can easily be added to the signals table, so we add new columns and write those values from the append_to_stream function (sketched below)
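A sketch of how the signal row could be written as part of the append (column names, `insert_signal/3` and the schemaless `insert_all` are assumptions):

```elixir
# Hypothetical sketch: after appending a batch of events, write one signal row
# that already carries everything listeners need to size their fetch.
defmodule MyApp.EventStore.Signals do
  def insert_signal(repo, scope_uuid, appended_events) do
    # Assumes a non-empty batch of freshly inserted event structs with :id set.
    max_id = appended_events |> Enum.map(& &1.id) |> Enum.max()

    repo.insert_all("signals", [
      %{
        scope_uuid: scope_uuid,
        events_count: length(appended_events),
        max_id: max_id,
        inserted_at: DateTime.utc_now()
      }
    ])
  end
end
```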
@mindreframer
Member Author

There are already quite a few changes in this PR, and there is enough noise from renames and the like. The foundation is good; the next changes will happen in a different PR.

@mindreframer mindreframer merged commit be97aeb into main Sep 8, 2024
1 check passed
@mindreframer mindreframer deleted the feat/event_broadcaster branch September 8, 2024 20:17