Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(starknet_state_sync): create state sync components #2531

Merged
merged 1 commit into from
Dec 12, 2024

Conversation

noamsp-starkware
Copy link
Contributor

No description provided.

@reviewable-StarkWare
Copy link

This change is Reviewable

Copy link
Contributor Author

noamsp-starkware commented Dec 8, 2024

@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from 18eb580 to ba5a628 Compare December 8, 2024 15:37
@noamsp-starkware noamsp-starkware changed the title feat(starknet_state_sync): create state sync component feat(starknet_state_sync): create state sync components Dec 8, 2024
@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from ba5a628 to 890bf5d Compare December 8, 2024 17:37
Copy link

codecov bot commented Dec 8, 2024

Codecov Report

Attention: Patch coverage is 91.07143% with 5 lines in your changes missing coverage. Please review.

Project coverage is 77.56%. Comparing base (e3165c4) to head (3e46906).
Report is 819 commits behind head on main.

Files with missing lines Patch % Lines
crates/papyrus_storage/src/lib.rs 60.00% 0 Missing and 2 partials ⚠️
...tes/starknet_integration_tests/src/config_utils.rs 0.00% 2 Missing ⚠️
crates/starknet_sequencer_node/src/components.rs 85.71% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2531       +/-   ##
===========================================
+ Coverage   40.10%   77.56%   +37.46%     
===========================================
  Files          26      392      +366     
  Lines        1895    42475    +40580     
  Branches     1895    42475    +40580     
===========================================
+ Hits          760    32947    +32187     
- Misses       1100     7237     +6137     
- Partials       35     2291     +2256     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 13 files reviewed, 5 unresolved discussions (waiting on @noamsp-starkware)


crates/starknet_sequencer_node/src/components.rs line 138 at r1 (raw file):

        mempool_p2p_runner,
        state_sync,
        state_sync_runner,

Are these the formal component names?

Code quote:

        state_sync,
        state_sync_runner,

crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

    pub batcher_storage_handle: TempDir,
    pub state_sync_storage_config: StorageConfig,
    pub state_sync_storage_handle: TempDir,

I thought the state sync storage is supposed to replace the rpc and batcher storage instances. Will they be deleted in following prs?

Code quote:

    pub state_sync_storage_config: StorageConfig,
    pub state_sync_storage_handle: TempDir,

crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

    let default_config = SequencerNodeConfig::default();
    assert_matches!(default_config.validate(), Ok(()));
    fs::remove_dir("data").unwrap();

Context for this please 🙏

Code quote:

    // The validate function will fail if the data directory does not exist. we assert the directory
    // does not exist to prevent deleting a non related folder with the same path.
    assert!(!Path::new("data").exists());
    fs::create_dir_all("data").expect("Should make a temporary `data` directory");

    let default_config = SequencerNodeConfig::default();
    assert_matches!(default_config.validate(), Ok(()));
    fs::remove_dir("data").unwrap();

crates/starknet_sequencer_node/src/config/config_test.rs line 119 at r1 (raw file):

    // The validate function will fail if the data directory does not exist so we change the path to
    // point to an existing directory.
    config.state_sync_config.storage_config.db_config.path_prefix = PathBuf::from(".");

Context please.

Code quote:

    // The validate function will fail if the data directory does not exist so we change the path to
    // point to an existing directory.
    config.state_sync_config.storage_config.db_config.path_prefix = PathBuf::from(".");

crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

pub fn create_state_sync_config(state_sync_storage_config: StorageConfig) -> StateSyncConfig {
    const STATE_SYNC_NETWORK_CONFIG_TCP_PORT_FOR_TESTING: u16 = 12345;

Why not get_available_port?

Code quote:

const STATE_SYNC_NETWORK_CONFIG_TCP_PORT_FOR_TESTING: u16 = 12345;

Copy link
Contributor Author

@noamsp-starkware noamsp-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 13 files reviewed, 5 unresolved discussions (waiting on @Itay-Tsabary-Starkware)


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

I thought the state sync storage is supposed to replace the rpc and batcher storage instances. Will they be deleted in following prs?

I'll check this. If so i'll add a TODO/remove them in a following PR.


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Why not get_available_port?

What do you mean? I didnt find a func with this name.


crates/starknet_sequencer_node/src/components.rs line 138 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Are these the formal component names?

Yes

Code snippet (i):

pub struct StateSync {
    pub request_sender: mpsc::Sender<(StateSyncRequest, oneshot::Sender<StateSyncResponse>)>,
}

Code snippet (ii):

pub struct StateSyncRunner {
    #[allow(dead_code)]
    request_receiver: mpsc::Receiver<(StateSyncRequest, oneshot::Sender<StateSyncResponse>)>,
    #[allow(dead_code)]
    storage_reader: StorageReader,
    network_future: BoxFuture<'static, Result<(), NetworkError>>,
    // TODO: change client and server to requester and responder respectively
    p2p_sync_client_future: BoxFuture<'static, Result<(), P2PSyncClientError>>,
    p2p_sync_server_future: BoxFuture<'static, ()>,
}

crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Context for this please 🙏

Will add to the comment.


crates/starknet_sequencer_node/src/config/config_test.rs line 119 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Context please.

Will add to the comment.

Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 13 files reviewed, 5 unresolved discussions (waiting on @noamsp-starkware)


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, noamsp-starkware wrote…

What do you mean? I didnt find a func with this name.

get_available_socket, apologies.


crates/starknet_sequencer_node/src/components.rs line 138 at r1 (raw file):

Previously, noamsp-starkware wrote…

Yes

The mempool_p2p module also has two components: the propagator and the runner. Can we stick to this convention, i.e., that we have state_sync_X and state_sync_Y. Otherwise this will get too messy to rememeber which component is from which module.


crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, noamsp-starkware wrote…

Will add to the comment.

I meant in a broader spec: why does config validation is related to existence of data directories?

@ShahakShama
Copy link
Contributor

crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

I meant in a broader spec: why does config validation is related to existence of data directories?

Is this what they do in batcher?

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 13 files at r1, 1 of 2 files at r2, all commit messages.
Reviewable status: 11 of 13 files reviewed, 6 unresolved discussions (waiting on @Itay-Tsabary-Starkware and @noamsp-starkware)


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, noamsp-starkware wrote…

I'll check this. If so i'll add a TODO/remove them in a following PR.

Only the rpc one. The batcher one will remain. The rpc one will be deleted within hopefully a week or two
Please add the appropriate TODO


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

get_available_socket, apologies.

This function is used in the dump_config exe. The fact that the port is available in the dump_config exe doesn't mean it will be available when we run the node


crates/starknet_sequencer_node/src/components.rs line 138 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

The mempool_p2p module also has two components: the propagator and the runner. Can we stick to this convention, i.e., that we have state_sync_X and state_sync_Y. Otherwise this will get too messy to rememeber which component is from which module.

I'd love to discuss this further in huddle or f2f, but it's unrelated to this PR and IMO shouldn't block


crates/starknet_integration_tests/src/state_reader.rs line 82 at r2 (raw file):

            .chain_id(chain_info.chain_id.clone())
            .build();
        create_test_state(&mut state_sync_storage_writer, chain_info, test_defined_accounts);

@Itay-Tsabary-Starkware is this function deterministic? will it return the same storage for the sync and the batcher?

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 7 unresolved discussions (waiting on @Itay-Tsabary-Starkware and @noamsp-starkware)


config/sequencer/default_config.json line 1095 at r2 (raw file):

    "description": "Prefix of the path of the node's storage directory, the storage file path will be <path_prefix>/<chain_id>. The path is not created automatically.",
    "privacy": "Public",
    "value": "./data"

This collides with papyrus. Change to ./sequencer_data

@noamsp-starkware noamsp-starkware force-pushed the noam.s/refactor_state_sync_to_use_p2p_sync branch from f1589c9 to 5e44f1b Compare December 9, 2024 12:51
Copy link
Contributor Author

@noamsp-starkware noamsp-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 7 unresolved discussions (waiting on @Itay-Tsabary-Starkware and @ShahakShama)


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, ShahakShama wrote…

Only the rpc one. The batcher one will remain. The rpc one will be deleted within hopefully a week or two
Please add the appropriate TODO

Done.


crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, ShahakShama wrote…

Is this what they do in batcher?

@Itay-Tsabary-Starkware The validation verifies the paths specified in the config actually exist. ./data is the default and intended path for the state sync storage (this will be refactored to ./sequencer_data).

@ShahakShama In batcher they use a . as the default storage directory, so no need to create this file explicitly.

Copy link
Contributor Author

@noamsp-starkware noamsp-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 7 unresolved discussions (waiting on @Itay-Tsabary-Starkware and @ShahakShama)


config/sequencer/default_config.json line 1095 at r2 (raw file):

Previously, ShahakShama wrote…

This collides with papyrus. Change to ./sequencer_data

Ok

Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 6 unresolved discussions (waiting on @noamsp-starkware and @ShahakShama)


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, noamsp-starkware wrote…

Done.

Add a name to the todo please 🙏


crates/starknet_integration_tests/src/state_reader.rs line 82 at r2 (raw file):

Previously, ShahakShama wrote…

@Itay-Tsabary-Starkware is this function deterministic? will it return the same storage for the sync and the batcher?

Yes. And also for the rpc.
But please note the StorageScope::FullArchive parameter that affect the returned storage.


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, ShahakShama wrote…

This function is used in the dump_config exe. The fact that the port is available in the dump_config exe doesn't mean it will be available when we run the node

I don't understand the comment, nor the desired behavior. Could you please elaborate on both?


crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, noamsp-starkware wrote…

@Itay-Tsabary-Starkware The validation verifies the paths specified in the config actually exist. ./data is the default and intended path for the state sync storage (this will be refactored to ./sequencer_data).

@ShahakShama In batcher they use a . as the default storage directory, so no need to create this file explicitly.

This validation is internal to the component, this does not belong in the globabl config validation scope imo. Please move it to a relevant module. The batcher has/had a similar validation, check it out for inspiration.

Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 6 unresolved discussions (waiting on @noamsp-starkware and @ShahakShama)


crates/starknet_sequencer_node/src/config/config_test.rs line 119 at r1 (raw file):

Previously, noamsp-starkware wrote…

Will add to the comment.

As per my suggestion/request above, this should be a part of the internal test / setup phase of the state sync component.

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 6 unresolved discussions (waiting on @Itay-Tsabary-Starkware and @noamsp-starkware)


config/sequencer/default_config.json line 1095 at r2 (raw file):

Previously, noamsp-starkware wrote…

Ok

This is not changed. Also, Add /sequencer_data to .gitignore

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 6 unresolved discussions (waiting on @Itay-Tsabary-Starkware, @noamsp-starkware, and @yair-starkware)


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

I don't understand the comment, nor the desired behavior. Could you please elaborate on both?

@yair-starkware also encountered this.
The issue is that what you're suggesting is to call get_available_socket() in every config we have ports in it, but the issue is that we first create the entire config and then run the node so the port isn't taken after we call get_available_socket

I think I have a different solution, we'll create a test utility for get_available_socket that holds a lazy static mutex for a vector/hashset of taken ports and whenever it is called it locks the mutex, gets a port that doesn't appear there and adds it there and returns the port

WDYT

Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 13 files reviewed, 5 unresolved discussions (waiting on @Itay-Tsabary-Starkware, @noamsp-starkware, and @yair-starkware)


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Add a name to the todo please 🙏

You can put my name if you want


crates/starknet_sequencer_node/src/config/config_test.rs line 79 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

This validation is internal to the component, this does not belong in the globabl config validation scope imo. Please move it to a relevant module. The batcher has/had a similar validation, check it out for inspiration.

As discussed in slack, IMO we should remove this validation and make it so that if the path doesn't exist, open_storage will create the directory in that path

@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from 8ca4162 to f9a9c47 Compare December 10, 2024 09:25
@noamsp-starkware noamsp-starkware changed the base branch from noam.s/refactor_state_sync_to_use_p2p_sync to main December 10, 2024 09:25
Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 17 files reviewed, 4 unresolved discussions (waiting on @noamsp-starkware, @ShahakShama, and @yair-starkware)


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, ShahakShama wrote…

@yair-starkware also encountered this.
The issue is that what you're suggesting is to call get_available_socket() in every config we have ports in it, but the issue is that we first create the entire config and then run the node so the port isn't taken after we call get_available_socket

I think I have a different solution, we'll create a test utility for get_available_socket that holds a lazy static mutex for a vector/hashset of taken ports and whenever it is called it locks the mutex, gets a port that doesn't appear there and adds it there and returns the port

WDYT

Ack about the problem.
Let's discuss solutions in a different scope.
For the mean time, please put this constant in a visible place (i.e., not within a function body mid-module, but at the top), with a clearly marked todo to get rid of it 🙏 (please include "Tsabary" in the todo as well)

@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from f9a9c47 to 8f94349 Compare December 10, 2024 10:30
Copy link
Contributor Author

@noamsp-starkware noamsp-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 17 files reviewed, 2 unresolved discussions (waiting on @ShahakShama and @yair-starkware)


config/sequencer/default_config.json line 1095 at r2 (raw file):

Previously, ShahakShama wrote…

This is not changed. Also, Add /sequencer_data to .gitignore

Done.


crates/starknet_integration_tests/src/state_reader.rs line 59 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Add a name to the todo please 🙏

Done.


crates/starknet_integration_tests/src/utils.rs line 237 at r1 (raw file):

Previously, Itay-Tsabary-Starkware wrote…

Ack about the problem.
Let's discuss solutions in a different scope.
For the mean time, please put this constant in a visible place (i.e., not within a function body mid-module, but at the top), with a clearly marked todo to get rid of it 🙏 (please include "Tsabary" in the todo as well)

Done.

@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from 8f94349 to b14087c Compare December 10, 2024 10:45
Copy link
Contributor

@Itay-Tsabary-Starkware Itay-Tsabary-Starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 13 files at r1, 7 of 14 files at r3, 7 of 7 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ShahakShama)

Copy link
Contributor

@yair-starkware yair-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 13 files at r1, 7 of 14 files at r3, 7 of 7 files at r5, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @noamsp-starkware and @ShahakShama)


crates/papyrus_storage/src/lib.rs line 170 at r5 (raw file):

    {
        fs::create_dir_all(storage_config.db_config.path_prefix.clone())?;
        info!("Created storage directory: {:?}", storage_config.db_config.path());

Please use the display format ({} instead of {:?} )

Code quote:

:?

@noamsp-starkware noamsp-starkware force-pushed the noam.s/create_state_sync_component branch from b14087c to 3e46906 Compare December 10, 2024 12:27
Copy link
Contributor

@ShahakShama ShahakShama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 7 of 14 files at r3, 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @noamsp-starkware)

@noamsp-starkware noamsp-starkware merged commit 9a29359 into main Dec 12, 2024
23 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants