
feat: Provider insolvency #278

Open
wants to merge 59 commits into main

Conversation

snowmead (Contributor) commented Nov 29, 2024

Mark providers as insolvent, slash them, and disallow operations for insolvent providers

Process providers in the on_idle hook to mark them as insolvent and slash any remaining deposit they may have.

Many extrinsics in pallet_file_system, pallet_storage_providers and pallet_payment_streams now fail if the provider is insolvent.
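
As a rough, self-contained model of that flow (the types and function names here are illustrative sketches, not the pallet's actual ones):

struct Provider {
    deposit: u128,
    insolvent: bool,
}

// Simplified stand-in for the `on_idle` processing described above: each queued
// provider is marked insolvent and whatever deposit it still holds is slashed.
fn process_insolvent_providers(queued: &mut [Provider], slashed_pot: &mut u128) {
    for provider in queued.iter_mut() {
        provider.insolvent = true;
        *slashed_pot += provider.deposit;
        provider.deposit = 0;
    }
}

fn main() {
    let mut queued = vec![Provider { deposit: 500, insolvent: false }];
    let mut slashed_pot: u128 = 0;
    process_insolvent_providers(&mut queued, &mut slashed_pot);
    assert!(queued[0].insolvent);
    assert_eq!(queued[0].deposit, 0);
    assert_eq!(slashed_pot, 500);
}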


Charging users only for the period when the provider is solvent

Charging users via pallet_payment_streams now takes into account the block at which the provider was marked insolvent. For both dynamic-rate and fixed-rate payment streams, the user is only charged for the time during which the provider was solvent.
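
As a small worked example of the pro-rating (the function and its parameters are hypothetical; the real logic lives in pallet_payment_streams):

// Hypothetical illustration: the chargeable window of a fixed-rate stream is capped
// at the block where the provider was marked insolvent.
fn chargeable_amount(
    rate_per_block: u128,
    last_charged_block: u64,
    current_block: u64,
    insolvency_block: Option<u64>,
) -> u128 {
    // Charge only up to the insolvency block, if there is one.
    let end_block = insolvency_block.map_or(current_block, |b| b.min(current_block));
    let blocks = end_block.saturating_sub(last_charged_block);
    rate_per_block * blocks as u128
}

fn main() {
    // The provider went insolvent at block 80: the user pays for 80 - 50 = 30 blocks,
    // not for the full 100 - 50 = 50 blocks.
    assert_eq!(chargeable_amount(2, 50, 100, Some(80)), 60);
    assert_eq!(chargeable_amount(2, 50, 100, None), 100);
}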


Deleting insolvent provider is done manually

A new extrinsic delete_provider allows anyone to delete a provider if certain conditions are met:

  • The provider must be insolvent
  • The provider must not have any active payment streams

A new runtime API, can_delete_provider, exposes a central piece of logic that checks these conditions and gives offchain processes the opportunity to automate provider deletion.
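
A minimal sketch of those two conditions (the struct here is hypothetical; the real check reads pallet storage):

struct ProviderState {
    insolvent: bool,
    active_payment_streams: u32,
}

/// A provider can be deleted only if it is insolvent and has no active payment streams.
fn can_delete_provider(state: &ProviderState) -> bool {
    state.insolvent && state.active_payment_streams == 0
}

fn main() {
    assert!(can_delete_provider(&ProviderState { insolvent: true, active_payment_streams: 0 }));
    assert!(!can_delete_provider(&ProviderState { insolvent: true, active_payment_streams: 3 }));
    assert!(!can_delete_provider(&ProviderState { insolvent: false, active_payment_streams: 0 }));
}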


Storage request with custom replication target

The file system pallet's issue_storage_request extrinsic now sets replication_target to a new Config parameter called DefaultReplicationTarget. This decouples the maximum replication target we accept for a storage request from the default we use when the caller does not supply one.
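
A rough sketch of the intent, with hypothetical names standing in for the two Config values:

// Illustrative only: a missing replication target falls back to the default, while
// still being validated against a separate maximum.
fn resolve_replication_target(
    requested: Option<u32>,
    default_replication_target: u32,
    max_replication_target: u32,
) -> Result<u32, &'static str> {
    let target = requested.unwrap_or(default_replication_target);
    if target == 0 || target > max_replication_target {
        return Err("replication target out of bounds");
    }
    Ok(target)
}

fn main() {
    // Caller omitted the target: the default applies.
    assert_eq!(resolve_replication_target(None, 3, 10), Ok(3));
    // Caller asked for more than the maximum: rejected.
    assert!(resolve_replication_target(Some(12), 3, 10).is_err());
}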


Client and runtime side submit_proof bug fixes

There was a bug in the submit_proof extrinsic where the checkpoint challenge period was set to the greatest challenge period. As a result, the checkpoint challenge a BSP would submit a proof for could be changed midway by the on_poll hook, causing the BSP to miss the checkpoint challenge. This also inadvertently introduced a potential attack vector: a BSP could intentionally wait a certain number of blocks, while still remaining within its challenge tick tolerance period, for the checkpoint challenges to become outdated, i.e. for the checkpoint challenge period to elapse so that a response is no longer awaited. This would allow BSPs to skip unwanted checkpoint challenges.

To solve this, the checkpoint challenge period is now set to greatest_challenge_period + challenge_tick_tolerance + 1, which leaves no possibility for a checkpoint challenge to be skipped.
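
Worked with concrete (made-up) numbers:

// The checkpoint challenge period now extends past every provider's challenge period
// plus the tick tolerance, so a BSP can never outwait a pending checkpoint challenge.
fn checkpoint_challenge_period(greatest_challenge_period: u64, challenge_tick_tolerance: u64) -> u64 {
    greatest_challenge_period + challenge_tick_tolerance + 1
}

fn main() {
    // With a greatest challenge period of 100 ticks and a tolerance of 10 ticks,
    // checkpoint challenges recur every 111 ticks, strictly more than 100 + 10.
    assert_eq!(checkpoint_challenge_period(100, 10), 111);
}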

The runtime- and client-side conditions for including checkpoint challenges have been cross-checked and fixed to be identical. There was a discrepancy where the client-side check would include checkpoint challenges when the runtime was not expecting any.


BSP volunteering task: Error handling and optimizations

The BSP task that handles volunteering now skips the file if the storage request is no longer open to volunteering, and if the volunteer extrinsic fails, the file is deleted from file storage and unregistered. This fixes a bug where, if another storage request were issued after the first volunteer attempt failed, the second volunteer would fail with FileAlreadyExists because the file was already in file storage.


TODOs

Many of the TODOs below were added because of the storage-delete integration test, which surfaced bugs and unimplemented logic around non-happy-path extrinsic executions. For example: a storage request is revoked, a BSP then requests to stop storing the file, the runtime automatically applies the mutation in a checkpoint challenge when the BSP submits a proof, and the bspConfirmStopStoring extrinsic subsequently fails due to a mismatch between the local and runtime roots.

Modify wait_for_tick so we can pass an arbitrary closure like the one below, which it would execute to determine whether it should keep waiting for the tick to be reached.

This would allow closing the task earlier when a storage request is fulfilled or revoked midway through the waiting process.

// TODO: Have this dynamically called at every tick in `wait_for_tick` to exit early without waiting until `earliest_volunteer_tick`
// TODO: in the event the storage request is closed mid-way through the process.
let can_volunteer = self
    .storage_hub_handler
    .blockchain
    .is_storage_request_open_to_volunteers(file_key.into())
    .await
    .map_err(|e| anyhow!("Failed to query file can volunteer: {:?}", e))?;
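
A minimal, synchronous sketch of what that wait_for_tick signature could look like (the real blockchain service API is async and differs in detail; names here are hypothetical):

enum WaitOutcome {
    TickReached,
    AbortedEarly,
}

// Alongside the target tick, take a predicate that is evaluated at every tick and can
// abort the wait early (e.g. when the storage request is fulfilled or revoked).
fn wait_for_tick<F>(
    target_tick: u64,
    ticks: impl Iterator<Item = u64>,
    mut should_continue: F,
) -> WaitOutcome
where
    F: FnMut(u64) -> bool,
{
    for tick in ticks {
        if !should_continue(tick) {
            // The caller's predicate says waiting is pointless (e.g. request revoked).
            return WaitOutcome::AbortedEarly;
        }
        if tick >= target_tick {
            return WaitOutcome::TickReached;
        }
    }
    WaitOutcome::AbortedEarly
}

fn main() {
    // The predicate reports the storage request closed at the third tick, so we abort
    // long before the target tick of 100 is reached.
    let still_open = [true, true, false, false, false];
    let outcome = wait_for_tick(100, 0..5, |tick| still_open[tick as usize]);
    assert!(matches!(outcome, WaitOutcome::AbortedEarly));
}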

Checking whether the storage request is still open to volunteering does not work when the BSP is catching up to the latest block. For example:

  1. BSP node is down at block 1
  2. chain advances to block 3
  3. storage request is opened
  4. chain advances to block 5
  5. storage request is revoked
  6. BSP node is up
  7. BSP node catches up to block 3
  8. BSP node volunteers for storage request and fails the extrinsic

This is because the BSP is unaware of the storage request being revoked at block 5, since it hasn't processed that block yet. When we call the is_storage_request_open_to_volunteers blockchain service command, we use the best block hash, which at the time of volunteering is block 3 in this case, whereas we should query the state of the latest block in the chain.

// Skip volunteering if the storage request is no longer open to volunteers.
// TODO: Handle the case where we're catching up to the latest block. We probably either want to skip volunteering or wait until
// TODO: we catch up to the latest block and, if the storage request is still open to volunteers, volunteer then.
if !can_volunteer {
    let err_msg = "Storage request is no longer open to volunteers. Skipping volunteering.";
    warn!(
        target: LOG_TARGET, "{}", err_msg
    );
    return Err(anyhow::anyhow!(err_msg));
}

For some reason we are revoking a storage request before the first BSP tries to confirm stop storing, and this is why the test fails at the end.

// Revoke the storage request otherwise the new storage request event is not being triggered
await userApi.sealBlock(userApi.tx.fileSystem.revokeStorageRequest(fileKey), shUser);

Also, right now the BSP that requested to stop storing a file automatically re-volunteers for it through the tasks. I propose we add the RPC methods listed below, which would allow BSPs to skip storage requests based on file key/bucket/user blacklists (see the sketch after the snippet).

// TODO: create an RPC to automatically execute everything below
// TODO: everything below should be removed and replaced with other testing logic
const inclusionForestProof = await bspApi.rpc.storagehubclient.generateForestProof(null, [
  fileKey
]);
await userApi.sealBlock(
  bspApi.tx.fileSystem.bspRequestStopStoring(
    fileKey,
    bucketId,
    location,
    userApi.shConsts.NODE_INFOS.user.AddressId,
    fingerprint,
    fileSize,
    false,
    inclusionForestProof.toString()
  ),
  bspKey
);
await userApi.assert.eventPresent("fileSystem", "BspRequestedToStopStoring");
// When a BSP requests to stop storing a file, we should also receive a NewStorageRequest
// event to replace the BSP that is leaving.
// TODO: add blacklisting file keys to avoid volunteering for files that the BSP requested to stop storing
// TODO: add rpc to remove file key from blacklisted file keys
// TODO: add rpc to add user to blacklisted users to skip any storage request from them
// TODO: add rpc to add bucket to blacklisted buckets to skip any storage request from them
await userApi.assert.eventPresent("fileSystem", "NewStorageRequest");
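
A rough sketch of what the blacklists backing those RPCs could look like (none of these names exist in the codebase yet; this is only an illustration of the proposal):

use std::collections::HashSet;

// The proposed RPC methods would mutate sets like these, and the volunteering task
// would consult `should_skip` before volunteering for a storage request.
#[derive(Default)]
struct VolunteerBlacklist {
    file_keys: HashSet<String>,
    users: HashSet<String>,
    buckets: HashSet<String>,
}

impl VolunteerBlacklist {
    fn should_skip(&self, file_key: &str, user: &str, bucket: &str) -> bool {
        self.file_keys.contains(file_key) || self.users.contains(user) || self.buckets.contains(bucket)
    }
}

fn main() {
    let mut blacklist = VolunteerBlacklist::default();
    // e.g. after bspRequestStopStoring, the task would blacklist the file key.
    blacklist.file_keys.insert("0xabc".to_string());
    assert!(blacklist.should_skip("0xabc", "alice", "bucket-1"));
    assert!(!blacklist.should_skip("0xdef", "alice", "bucket-1"));
}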

This is broken for the reasons mentioned above, so the section below is commented out until we resolve it.

// TODO: commented out since this extrinsic will inevitably fail. The reason is that the revoke storage request executed above
// TODO: created a checkpoint challenge remove mutation, which the BSP responded to, removing the file key from the forest before executing this extrinsic.
// TODO: we should remove this entirely and implement a task or rpc to handle automatic stop storing
// await userApi.sealBlock(
//   userApi.tx.fileSystem.bspConfirmStopStoring(
//     fileKey,
//     inclusionForestProof
//   ),
//   bspKey
// );
// await userApi.assert.eventPresent(
//   "fileSystem",
//   "BspConfirmStoppedStoring"
// );

snowmead and others added 30 commits November 12, 2024 13:32
@snowmead snowmead marked this pull request as ready for review December 2, 2024 19:30


Copilot reviewed 44 out of 59 changed files in this pull request and generated no suggestions.

Files not reviewed (15)
  • api-augment/src/interfaces/lookup.ts: File too large
  • api-augment/src/interfaces/types-lookup.ts: File too large
  • pallets/providers/runtime-api/src/lib.rs: Evaluated as low risk
  • api-augment/src/interfaces/registry.ts: Evaluated as low risk
  • pallets/payment-streams/src/lib.rs: Evaluated as low risk
  • api-augment/src/interfaces/augment-api-consts.ts: Evaluated as low risk
  • pallets/proofs-dealer/src/mock.rs: Evaluated as low risk
  • pallets/payment-streams/src/mock.rs: Evaluated as low risk
  • pallets/bucket-nfts/src/mock.rs: Evaluated as low risk
  • pallets/file-system/src/mock.rs: Evaluated as low risk
  • client/indexer-service/src/handler.rs: Evaluated as low risk
  • api-augment/src/interfaces/augment-api-tx.ts: Evaluated as low risk
  • pallets/providers/Cargo.toml: Evaluated as low risk
  • api-augment/src/interfaces/augment-api-errors.ts: Evaluated as low risk
  • api-augment/src/interfaces/augment-api-events.ts: Evaluated as low risk
Comments skipped due to low confidence (2)

pallets/payment-streams/src/utils.rs:75

  • [nitpick] The error message Error::<T>::ProviderInsolvent should be reviewed to ensure it is clear and helpful.
if <T::ProvidersPallet as ReadProvidersInterface>::is_provider_insolvent(*provider_id) {

pallets/file-system/src/lib.rs:1356

  • The assertion message should use T::DefaultReplicationTarget::zero() instead of T::ReplicationTargetType::zero() for consistency.
assert!(default_replication_target > T::ReplicationTargetType::zero(),
snowmead (Contributor, Author) commented Dec 3, 2024

Should we add a check to the ReadChallengeableProvidersInterface::is_provider implementation that returns false if the provider is insolvent so that the proofs dealer pallet does not accept any proofs from them?

I think we should.
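
A simplified standalone model of that suggestion (the real implementation would live behind ReadChallengeableProvidersInterface; the struct here is illustrative):

use std::collections::HashSet;

struct Providers {
    registered: HashSet<u64>,
    insolvent: HashSet<u64>,
}

impl Providers {
    // Treat an insolvent provider as not being a provider at all, so the proofs
    // dealer pallet would reject its proofs.
    fn is_provider(&self, id: u64) -> bool {
        self.registered.contains(&id) && !self.insolvent.contains(&id)
    }
}

fn main() {
    let mut providers = Providers {
        registered: [1, 2].into_iter().collect(),
        insolvent: HashSet::new(),
    };
    assert!(providers.is_provider(1));
    providers.insolvent.insert(1);
    // Once insolvent, the provider is no longer challengeable.
    assert!(!providers.is_provider(1));
}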
