diff --git a/storage/aws/README.md b/storage/aws/README.md
index 19b4aab6..838eff14 100644
--- a/storage/aws/README.md
+++ b/storage/aws/README.md
@@ -29,34 +29,28 @@ A table with a single row which is used to keep track of the next assignable seq
 This holds batches of entries keyed by the sequence number assigned to the first entry in the batch.

 ### `IntCoord`
-TODO: add the new checkpoint updater logic, and update the docstring in aws.go.
-
-This table is used to coordinate integration of sequenced batches in the `Seq` table.
+This table is used to coordinate integration of sequenced batches in the `Seq` table, and keep track of the current tree state.

 ## Life of a leaf
-TODO: add the new checkpoint updater logic.
-
 1. Leaves are submitted by the binary built using Tessera via a call the storage's `Add` func.
-2. [Not implemented yet - Dupe squashing: look for existing `` object, read assigned sequence number if present and return.]
-3. The storage library batches these entries up, and, after a configurable period of time has elapsed
+1. The storage library batches these entries up, and, after a configurable period of time has elapsed
    or the batch reaches a configurable size threshold, the batch is written to the `Seq` table which
    effectively assigns a sequence numbers to the entries using the following algorithm:
    In a transaction:
    1. selects next from `SeqCoord` with for update ← this blocks other FE from writing their pools, but only for a short duration.
-   2. Inserts batch of entries into `Seq` with key `SeqCoord.next`
-   3. Update `SeqCoord` with `next+=len(batch)`
-4. Integrators periodically integrate new sequenced entries into the tree:
+   1. Inserts batch of entries into `Seq` with key `SeqCoord.next`
+   1. Update `SeqCoord` with `next+=len(batch)`
+1. Integrators periodically integrate new sequenced entries into the tree:
    In a transaction:
    1. select `seq` from `IntCoord` with for update ← this blocks other integrators from proceeding.
-   2. Select one or more consecutive batches from `Seq` for update, starting at `IntCoord.seq`
-   3. Write leaf bundles to S3 using batched entries
-   4. Integrate in Merkle tree and write tiles to S3
-   5. Update checkpoint in S3
-   6. Delete consumed batches from `Seq`
-   7. Update `IntCoord` with `seq+=num_entries_integrated`
-   8. [Not implemented yet - Dupe detection:
-      1. Writes out `` containing the leaf's sequence number]
+   1. Select one or more consecutive batches from `Seq` for update, starting at `IntCoord.seq`
+   1. Write leaf bundles to S3 using batched entries
+   1. Integrate in Merkle tree and write tiles to S3
+   1. Update checkpoint in S3
+   1. Delete consumed batches from `Seq`
+   1. Update `IntCoord` with `seq+=num_entries_integrated` and the latest `rootHash`
+1. Checkpoints representing the latest state of the tree are published at the configured interval.

 ## Dedup

@@ -75,12 +69,12 @@ operational overhead, code complexity, and so was selected.

 The alpha implementation was tested with entries of size 1KB each, at a
 write rate of 1500/s. This was done using the smallest possible Aurora instance
-availalbe, `db.r5.large`, running `8.0.mysql_aurora.3.05.2`.
+available, `db.r5.large`, running `8.0.mysql_aurora.3.05.2`.

 Aurora (Serverless v2) worked out well, but seems less cost effective than
-provisionned Aurora for sustained traffic. For now, we decided not to explore this option further.
+provisioned Aurora for sustained traffic. For now, we decided not to explore this option further.

-RDS (MySQL) worked out well, but requires more admistrative overhead than
+RDS (MySQL) worked out well, but requires more administrative overhead than
 Aurora. For now, we decided not to explore this option further.

 DynamoDB worked out to be less cost efficient than Aurora and RDS. It also has
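For illustration, the `SeqCoord`/`Seq` sequencing transaction described in the AWS README above could look roughly like the Go sketch below. The schema (`SeqCoord(id, next)`, `Seq(id, data)`), the single coordination row with `id = 0`, and the `sequenceBatch` helper are assumptions made for this example only, not the actual tables or API of the storage implementation.

```go
package storage

import (
	"context"
	"database/sql"
	"fmt"
)

// sequenceBatch durably assigns contiguous sequence numbers to a serialised
// batch of n entries: it locks and reads the next assignable index from
// SeqCoord, stores the batch in Seq keyed by that index, then advances SeqCoord.
func sequenceBatch(ctx context.Context, db *sql.DB, batch []byte, n uint64) (first uint64, err error) {
	tx, err := db.BeginTx(ctx, nil)
	if err != nil {
		return 0, err
	}
	defer func() {
		if err != nil {
			_ = tx.Rollback()
		}
	}()

	// Lock the coordination row so no other frontend can hand out the same range.
	if err = tx.QueryRowContext(ctx, "SELECT next FROM SeqCoord WHERE id = 0 FOR UPDATE").Scan(&first); err != nil {
		return 0, fmt.Errorf("read SeqCoord: %w", err)
	}

	// Store the whole batch keyed by the first sequence number it covers.
	if _, err = tx.ExecContext(ctx, "INSERT INTO Seq (id, data) VALUES (?, ?)", first, batch); err != nil {
		return 0, fmt.Errorf("insert Seq: %w", err)
	}

	// Advance the next assignable sequence number past this batch.
	if _, err = tx.ExecContext(ctx, "UPDATE SeqCoord SET next = ? WHERE id = 0", first+n); err != nil {
		return 0, fmt.Errorf("update SeqCoord: %w", err)
	}

	return first, tx.Commit()
}
```

The row lock on `SeqCoord` is held only for the duration of this short transaction, which is what keeps contention between frontends low.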
diff --git a/storage/gcp/README.md b/storage/gcp/README.md
index bee47c1f..07fdb002 100644
--- a/storage/gcp/README.md
+++ b/storage/gcp/README.md
@@ -34,7 +34,6 @@ This table is used to coordinate integration of sequenced batches in the `Seq` t

 ## Life of a leaf
 1. Leaves are submitted by the binary built using Tessera via a call the storage's `Add` func.
-1. Dupe squashing (TODO): look for existing `` object, read assigned sequence number if present and return.
 1. The storage library batches these entries up, and, after a configurable period of time has elapsed
    or the batch reaches a configurable size threshold, the batch is written to the `Seq` table which
    effectively assigns a sequence numbers to the entries using the following algorithm:
@@ -48,11 +47,9 @@ This table is used to coordinate integration of sequenced batches in the `Seq` t
    1. Select one or more consecutive batches from `Seq` for update, starting at `IntCoord.seq`
    1. Write leaf bundles to GCS using batched entries
    1. Integrate in Merkle tree and write tiles to GCS
-   1. Update checkpoint in GCS
    1. Delete consumed batches from `Seq`
-   1. Update `IntCoord` with `seq+=num_entries_integrated`
-   1. Dupe detection (TODO):
-      1. Writes out `` containing the leaf's sequence number
+   1. Update `IntCoord` with `seq+=num_entries_integrated` and the latest `rootHash`
+1. Checkpoints representing the latest state of the tree are published at the configured interval.

 ## Dedup

diff --git a/storage/mysql/DESIGN.md b/storage/mysql/DESIGN.md
index 60feeff7..6f67e203 100644
--- a/storage/mysql/DESIGN.md
+++ b/storage/mysql/DESIGN.md
@@ -17,7 +17,11 @@ The DB layout has been designed such that serving any read request is a point lo

 #### `Checkpoint`

-A single row that records the current state of the log. Updated after every sequence + integration.
+A single row that records the current published checkpoint.
+
+#### `TreeState`
+
+A single row that records the current state of the tree. Updated after every integration.

 #### `Subtree`

@@ -51,12 +55,13 @@ Sequence pool:

 Sequence & integrate (DB integration starts here):

 1. Takes a batch of entries to sequence and integrate
-1. Starts a transaction, which first takes a write lock on the checkpoint row to ensure that:
+1. Starts a transaction, which first takes a write lock on the `TreeState` row to ensure that:
    1. No other processes will be competing with this work.
-   1. That the next index to sequence is known (this is the same as the current checkpoint size)
+   1. That the next index to sequence is known (this is the same as the current tree size)
 1. Update the required TiledLeaves rows
-1. Perform an integration operation to update the Merkle tree, updating/adding Subtree rows as needed, and eventually updating the Checkpoint row
+1. Perform an integration operation to update the Merkle tree, updating/adding Subtree rows as needed, and eventually updating the `TreeState` row
 1. Commit the transaction
+1. Checkpoints representing the latest state of the tree are published at the configured interval.

 ## Costs
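The change common to all three backends above is that checkpoints representing the latest state of the tree are now published at a configured interval, from the tree state recorded by the integrator. A minimal sketch of what such a publication loop could look like is below; the `TreeState` struct and the `latest`/`publish` function parameters are placeholders for this example, not Tessera's actual types or APIs.

```go
package storage

import (
	"context"
	"time"
)

// TreeState holds the integrated tree size and root hash recorded by the
// integrator (e.g. in the IntCoord or TreeState row).
type TreeState struct {
	Size     uint64
	RootHash []byte
}

// publishCheckpoints periodically reads the latest integrated tree state and
// publishes a checkpoint derived from it, until ctx is cancelled.
func publishCheckpoints(ctx context.Context, interval time.Duration,
	latest func(context.Context) (TreeState, error),
	publish func(context.Context, TreeState) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		state, err := latest(ctx)
		if err != nil {
			continue // transient read failure: try again on the next tick
		}
		_ = publish(ctx, state) // a failed publication is retried on the next tick
	}
}
```

In a loop like this, the published checkpoint may lag the integrated tree by up to the configured interval (longer if a publication attempt fails and is retried), which is the trade-off for taking checkpoint writes out of the integration path.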