[indexer-alt] add prune impls for each pipeline #20635

Open
wants to merge 1 commit into
base: indexer-alt-cp-mapping-for-pruning

Contributor

@emmazzz emmazzz commented Dec 14, 2024

Description

Added prune implementations for each pipeline in the indexer-alt schema, built on Will's cp mapping PR.

Test plan

Will add tests.


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env December 14, 2024 04:11 — with GitHub Actions Inactive
@emmazzz emmazzz marked this pull request as ready for review December 14, 2024 08:01
@emmazzz emmazzz temporarily deployed to sui-typescript-aws-kms-test-env December 14, 2024 08:01 — with GitHub Actions Inactive
async fn prune(range: PrunableRange, conn: &mut db::Connection<'_>) -> Result<usize> {
    let (from, to) = range.containing_epochs();
    let filter = kv_epoch_starts::table
        .filter(kv_epoch_starts::epoch.between(from as i64, to as i64 - 1));
Contributor Author

@emmazzz emmazzz Dec 14, 2024

I have not fully convinced myself that pruning epoch-grained tables will just work like this when our watermark and retention are checkpoint-grained. I will add a test tomorrow to make sure.
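
One way to reason about this concern is to only prune epochs that fall entirely inside the checkpoint range being pruned. The following is a minimal sketch under stated assumptions, not the PR's `containing_epochs`: the function name `fully_covered_epochs` and the `epoch_starts` slice (first checkpoint of each epoch, sorted ascending) are hypothetical stand-ins for whatever the real mapping provides.

```rust
use std::ops::Range;

/// Hypothetical helper (not the PR's `containing_epochs`): returns the epochs
/// that lie entirely inside the half-open checkpoint range `[from_cp, to_cp)`.
///
/// Assumption: `epoch_starts[e]` is the first checkpoint of epoch `e`, sorted
/// ascending. The last listed epoch has no known end, so it is never returned.
fn fully_covered_epochs(epoch_starts: &[u64], from_cp: u64, to_cp: u64) -> Range<u64> {
    // First epoch that starts at or after `from_cp`.
    let first = epoch_starts.partition_point(|&start| start < from_cp) as u64;
    // Epoch `e` is complete once epoch `e + 1` starts at or before `to_cp`,
    // so count the starts that are <= `to_cp` and drop the last one.
    let started = epoch_starts.partition_point(|&start| start <= to_cp) as u64;
    let end = started.saturating_sub(1);
    // Collapse to an empty range when no epoch is fully covered.
    first..end.max(first)
}
```

With epoch starts at checkpoints 0, 100, 250 and 400, a checkpoint range of 50..260 covers only epoch 1 under this policy, because epoch 0 begins before the range and epoch 2 has not ended inside it. Whether pruning should be this conservative is a design choice the thread leaves open.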

Member

Indeed -- and I would say that if this implementation doesn't behave correctly, it would be good to change the helper function in PrunableRange so that all prune impls can follow the same pattern (regardless of whether they are epoch-, checkpoint-, or transaction-grained), rather than change the epoch-grained impls to have a slightly different structure.

@emmazzz emmazzz requested review from wlmyng and amnn December 14, 2024 08:04
Member

@amnn amnn left a comment

This looks great, thanks @emmazzz. There are some suggested changes on @wlmyng's PR that may affect this one (how the epoch helpers are implemented might introduce an off-by-one difference in this PR, and there is also a suggestion about changing the PrunableRange interface to move the responsibility to translate bounds into the individual prune impls).
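
To make the off-by-one risk concrete, here is a minimal sketch assuming the range helpers return a half-open range `[from, to)` while the filter uses an inclusive `BETWEEN`. Both function names are hypothetical and exist only for illustration. If a helper changed to return an inclusive upper bound, a caller that still subtracts one would skip the last row.

```rust
/// Hypothetical helper: turns the half-open range `[from, to)` into the
/// inclusive `(lo, hi)` pair that an inclusive `BETWEEN lo AND hi` expects.
/// Returns `None` for an empty range or a bound that does not fit in `i64`.
fn inclusive_bounds(from: u64, to: u64) -> Option<(i64, i64)> {
    let max = i64::MAX as u64;
    // After this check, from <= to - 1 <= i64::MAX, so both casts are lossless.
    if from >= to || to - 1 > max {
        return None;
    }
    Some((from as i64, (to - 1) as i64))
}

/// Counts the values an inclusive `BETWEEN lo AND hi` filter would match.
/// This is an in-memory stand-in for the database filter.
fn count_between(rows: &[i64], lo: i64, hi: i64) -> usize {
    rows.iter().filter(|&&r| lo <= r && r <= hi).count()
}
```

Centralizing the conversion in one place means each prune impl cannot disagree about which end of the range is exclusive.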

I'll leave it with you and @wlmyng to coordinate those changes and then land both PRs!

Member

Given we can now delete by tx sequence number, should we get rid of the index on cp_sequence_number and write this prune impl based on the tx_interval?

Contributor

Do we actually have an index on cp_sequence_number? It seems to be a field on the table with no corresponding index, so we might as well proceed with an implementation based on tx_interval.

Contributor

Ah, I must've missed something ... it looks like kv_transactions has only the cp_sequence_number field, so we'd need to backfill that. And since the primary key is on tx_digest, we'd need to introduce an index on tx_sequence_number.
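
As an in-memory analogy for why a range delete wants an ordered index, the sketch below keeps a map keyed by digest (standing in for the primary key) alongside an ordered map keyed by sequence number (standing in for the proposed index). This is an illustration only: the struct and its fields are hypothetical and do not reflect the actual Postgres schema or how the database executes the delete.

```rust
use std::collections::{BTreeMap, HashMap};
use std::ops::Range;

/// Hypothetical in-memory stand-in for a table keyed by digest.
#[derive(Default)]
struct KvTransactions {
    /// "Primary key": digest -> tx sequence number (row payload omitted).
    by_digest: HashMap<String, u64>,
    /// "Secondary index": tx sequence number -> digest, kept in sorted order.
    by_seq: BTreeMap<u64, String>,
}

impl KvTransactions {
    fn insert(&mut self, digest: &str, tx_seq: u64) {
        self.by_digest.insert(digest.to_owned(), tx_seq);
        self.by_seq.insert(tx_seq, digest.to_owned());
    }

    /// Deletes every row whose sequence number is in the half-open `range`
    /// and returns how many rows were removed.
    fn prune(&mut self, range: Range<u64>) -> usize {
        // `BTreeMap::range` panics on an inverted range, so bail out early.
        if range.start >= range.end {
            return 0;
        }
        // The ordered index finds the affected rows without a full scan.
        let doomed: Vec<(u64, String)> = self
            .by_seq
            .range(range)
            .map(|(seq, digest)| (*seq, digest.clone()))
            .collect();
        for (seq, digest) in &doomed {
            self.by_seq.remove(seq);
            self.by_digest.remove(digest);
        }
        doomed.len()
    }
}
```

Without the ordered map, `prune` would have to inspect every entry in `by_digest`, which is the in-memory counterpart of a sequential scan.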

Member

@amnn amnn left a comment

I think again this PR would change because of comments on the earlier PR, but I think those changes should be mechanical, so accepting to unblock!

Comment on lines 64 to 71
    let range_mapping = PrunableRange::get_range(conn, from, to).await?;
    let (from_tx, to_tx) = range_mapping.tx_interval();
    let filter = ev_emit_mod::table
        .filter(ev_emit_mod::tx_sequence_number.between(from_tx as i64, to_tx as i64 - 1));
Member

My suggestion in the earlier PR would translate to something like this on this side:

Suggested change
-    let range_mapping = PrunableRange::get_range(conn, from, to).await?;
-    let (from_tx, to_tx) = range_mapping.tx_interval();
-    let filter = ev_emit_mod::table
-        .filter(ev_emit_mod::tx_sequence_number.between(from_tx as i64, to_tx as i64 - 1));
+    let Range { start, end } = tx_interval(conn, from..to);
+    let filter = ev_emit_mod::table
+        .filter(ev_emit_mod::tx_sequence_number.between(start as i64, end as i64 - 1));
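
For readers unfamiliar with the shape being suggested, here is a synchronous, in-memory sketch of a free function that maps a checkpoint range to a transaction range. It is an assumption-laden illustration: the real `tx_interval` takes a database connection, and the `tx_lo` slice here (first transaction sequence number of each checkpoint) is a hypothetical replacement for the checkpoint mapping table.

```rust
use std::ops::Range;

/// Hypothetical in-memory version of a checkpoint-to-transaction lookup.
///
/// Assumption: `tx_lo[cp]` is the first tx sequence number in checkpoint `cp`.
/// Maps the half-open checkpoint range `cps` to the half-open range of
/// transaction sequence numbers it contains, or `None` if either bound is
/// missing from the mapping or the range is inverted.
fn tx_interval(tx_lo: &[u64], cps: Range<u64>) -> Option<Range<u64>> {
    if cps.start > cps.end {
        return None;
    }
    // The first tx of the end checkpoint is the exclusive upper bound.
    let start = *tx_lo.get(cps.start as usize)?;
    let end = *tx_lo.get(cps.end as usize)?;
    Some(start..end)
}
```

A caller can destructure the result as `let Range { start, end } = ...`, mirroring the suggestion, and then apply the same exclusive-to-inclusive adjustment before building the filter.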

@wlmyng wlmyng force-pushed the indexer-alt-cp-mapping-for-pruning branch 3 times, most recently from 92c75e5 to 87504e7 Compare December 26, 2024 17:11
@wlmyng wlmyng force-pushed the indexer-alt-prune-impls branch from b07a86e to 4726c70 Compare December 26, 2024 17:12
@wlmyng wlmyng temporarily deployed to sui-typescript-aws-kms-test-env December 26, 2024 17:12 — with GitHub Actions Inactive
3 participants