-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[indexer-alt] add prune impls for each pipeline #20635
base: indexer-alt-cp-mapping-for-pruning
Are you sure you want to change the base?
[indexer-alt] add prune impls for each pipeline #20635
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
3 Skipped Deployments
|
async fn prune(range: PrunableRange, conn: &mut db::Connection<'_>) -> Result<usize> { | ||
let (from, to) = range.containing_epochs(); | ||
let filter = kv_epoch_starts::table | ||
.filter(kv_epoch_starts::epoch.between(from as i64, to as i64 - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not fully convinced myself that pruning epoch grained tables will just work like this, even when our watermark and retention are checkpoint grained. I will add a test tomorrow to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed -- and I would say that if this implementation doesn't behave correctly, it would be good to change the helper function in PrunableRange
so that all prune impls can follow the same pattern (regardless of whether they are epoch-, checkpoint-, or transaction-grained), rather than change the epoch-grained impls to have a slightly different structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, thanks @emmazzz. There are some suggested changes on @wlmyng's PR that may affect this one (how the epoch helpers are implemented might introduce an off-by-one difference in this PR, and there is also a suggestion about changing the PrunableRange
interface to move the responsibility to translate bounds into the individual prune
impls).
I'll leave it with you and @wlmyng to coordinate those changes and then land both PRs!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given we can now delete by tx sequence number, should we get rid of the index on cp_sequence_number
and write this prune
impl based on the tx_interval
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually have an index on cp_sequence_number
? Seems like it is a field on the table, but no corresponding index. We might as well proceed with implementation based on tx_interval
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I must've missed something ... it looks like kv_transactions
has only the cp_sequence_number
field. So we'd need to backfill that. And since the primary key is on tx_digest
, we'd need to introduce an index on tx_sequence_number
async fn prune(range: PrunableRange, conn: &mut db::Connection<'_>) -> Result<usize> { | ||
let (from, to) = range.containing_epochs(); | ||
let filter = kv_epoch_starts::table | ||
.filter(kv_epoch_starts::epoch.between(from as i64, to as i64 - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed -- and I would say that if this implementation doesn't behave correctly, it would be good to change the helper function in PrunableRange
so that all prune impls can follow the same pattern (regardless of whether they are epoch-, checkpoint-, or transaction-grained), rather than change the epoch-grained impls to have a slightly different structure.
160d0a8
to
6cabf8a
Compare
2ce69cb
to
438479b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think again this PR would change because of comments on the earlier PR, but I think those changes should be mechanical, so accepting to unblock!
let range_mapping = PrunableRange::get_range(conn, from, to).await?; | ||
let (from_tx, to_tx) = range_mapping.tx_interval(); | ||
let filter = ev_emit_mod::table | ||
.filter(ev_emit_mod::tx_sequence_number.between(from_tx as i64, to_tx as i64 - 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion in the earlier PR would translate to something like this on this side:
let range_mapping = PrunableRange::get_range(conn, from, to).await?; | |
let (from_tx, to_tx) = range_mapping.tx_interval(); | |
let filter = ev_emit_mod::table | |
.filter(ev_emit_mod::tx_sequence_number.between(from_tx as i64, to_tx as i64 - 1)); | |
let Range { start, end } = tx_interval(conn, from..to); | |
let filter = ev_emit_mod::table | |
.filter(ev_emit_mod::tx_sequence_number.between(start as i64, end as i64 - 1)); |
92c75e5
to
87504e7
Compare
b07a86e
to
4726c70
Compare
4726c70
to
e8252bf
Compare
Description
Added
prune
implementations for pipeline inside indexer alt schema, built upon Will's cp mapping PR.Test plan
Will add tests.
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.