Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set PoV size limit to 10 Mb #5884

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Conversation

s0me0ne-unkn0wn
Copy link
Contributor

Okay, enough theory, let's kick it off. Closes #5334.

What are the implications from my point of view right now:

After this PR is merged

All the new chains (including test ones) started from genesis will support 10 Mb PoVs.

If some parachain updates to the SDK version containing this value and then upgrades its runtime, the runtime will allow for 10 Mb PoVs while the relay chain still has a 5 Mb limit. I don't think it's a problem because the collator building a block will be limited by the PoV size limit stored in the persistent validation data (or even half of that limit if the full-pov-size feature is not enabled). That may lead to overshooting some per-dispatch-class limits but not the absolute PoV size limits, which is safe. Still, a malicious collator can build a block that the relay chain will reject. I'm not 100% sure if we need to take care of such a thing as a "malicious collator". If that's a concern, the possibility of temporarily using another constant for parachains is discussible.

After this change is propagated to the fellowship runtimes

After a new relay chain runtime version with this value is released and the runtime upgrade is performed, it will theoretically allow for a 10 Mb PoV size, but the on-chain limit will still remain at 5 Mb.

After governance changes the max_pov_size config value

Everything supports 10 Mb PoVs now.

@acatangiu please comment on implications for bridges, I see it's used in bridges code, but not sure if any special treatment is needed.

@eskimor
Copy link
Member

eskimor commented Oct 1, 2024

After a new relay chain runtime version with this value is released and the runtime upgrade is performed, it will theoretically allow for a 10 Mb PoV size, but the on-chain limit will still remain at 5 Mb.

Not only the chain also candidate validation will reject it.

For the first part: Why don't we already read the initial value from the relay chain config?

@s0me0ne-unkn0wn
Copy link
Contributor Author

Why don't we already read the initial value from the relay chain config?

You mean in genesis? I suppose because we treat the genesis state as a static state, and we don't want to make requests to the relay chain when building a chainspec?

@@ -426,7 +426,7 @@ pub const MAX_HEAD_DATA_SIZE: u32 = 1 * 1024 * 1024;
/// * checking updates to this stored runtime configuration do not exceed this limit
/// * when detecting a PoV decompression bomb in the client
// NOTE: This value is used in the runtime so be careful when changing it.
pub const MAX_POV_SIZE: u32 = 5 * 1024 * 1024;
pub const MAX_POV_SIZE: u32 = 10 * 1024 * 1024;
Copy link
Contributor

@alexggh alexggh Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we validate this works e2e, like have a parachain that generates 10MiB PoV and confirm parachain blocks work at 6s rate with all blocks included on chain ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It kind of makes sense to add a glutton test with these 10MB PoVs. Later we should do a more scaled up test on Versi.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We haven't validated anything yet (beyond running local zombienets to ensure 10 Mb PoVs work in principle). The purpose of this PR is to gather opinions and concerns, we're not merging it before we do a lot of tests anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beyond running local zombienets to ensure 10 Mb PoVs work in principle

That's what I was asking for :D.

Copy link
Contributor

@sandreim sandreim Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this PR is to gather opinions and concerns, we're not merging it before we do a lot of tests anyway.

But the description of the PRs and my own reasoning tells me that we can merge it. PoV should still be limited by the relay runtime config value at 5MB. To have 10MB in practice you need governance to bump the chain limit and also parachain runtimes to be adjusted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, okay, I'll put that another way. By this PR, I argue that the given changeset is enough to enable 10 Mb PoVs, and I expect others to argue why it is not. @alexggh's point is that we need more testing before applying the changes, and that's fair. By now, I've finished the local tests with gluttons and will do some stress tests. After that, I'm ready to go to Versi, but I'd like to make it as close to real-life as possible on Versi, that is, to start with a regular network with 5 Mb PoVs and then upgrade it to 10 Mb step by step.

But if the lack of testing is the only argument against merging it in this form, it's beautiful ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexggh's point is that we need more testing before applying the changes, and that's fair.

I was simply asking, because I was curious what tests we've done on this end, the fact that you ran it locally with 10MiB is good enough for me.

I understand that the relay-chain configuration is guarding us against using more than 5MiB, and it is probably safe to merge it without positive evidence that the new maximum is working all the way trough our stack, I was simply trying to avoid the situation where we raise this constant because we know the relay-chain configuration is at 5MiB and it is risk free now, but then 6 months later someone else comes without this context and raises the relay-chain configuration to 10MiB because MAX_POV_SIZE is already 10MiB and it must be safe :D.

@acatangiu
Copy link
Contributor

@acatangiu please comment on implications for bridges, I see it's used in bridges code, but not sure if any special treatment is needed.

I am not aware of this being used in bridges/bridging. I also did a code search and didn't find usage in any bridge-related code. So feel free to go ahead from bridges point of view.

@bkontur
Copy link
Contributor

bkontur commented Nov 28, 2024

@s0me0ne-unkn0wn those failed tests, you pointed me to, are just some sanity checks for tx priority,
and those calculations uses: pallet_transaction_payment::ChargeTransactionPayment::<Runtime>::get_priority which internally uses T::BlockWeights::get().max_block which comes from RuntimeBlockWeights->MAXIMUM_BLOCK_WEIGHT->MAX_POV_SIZE.

So, if you doubled that MAX_POV_SIZE, you need to also doubled that constant here.

The PriorityBoostPerMessage value (182044444444444) must be fixed to: 364088888888888

Are any other tests failing or just these bridges sanity checks? Shouldn't we have more test to cover MAX_POV_SIZE changes (not sure if relevant)?

@s0me0ne-unkn0wn
Copy link
Contributor Author

Are any other tests failing or just these bridges sanity checks?

Hard to say as our tests tend to fail randomly sometimes 🙃 We'll see after I fix this one.

Shouldn't we have more test to cover MAX_POV_SIZE changes (not sure if relevant)?

Not sure indeed, as we're not changing the PoV size itself, I mean, after these changes are applied, the relay chain will still instruct parachains to build 5 Mb PoVs max, until it is changed in the parachain host configuration through the governance action.

@paritytech-workflow-stopper
Copy link

All GitHub workflows were cancelled due to failure one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/12211814391
Failed job name: test-linux-stable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Increase max_pov_size to 10MB
7 participants