Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store trielog as blobdb #6289

Merged
merged 20 commits into from
Jan 31, 2024
Merged

Conversation

usmansaleem
Copy link
Member

@usmansaleem usmansaleem commented Dec 13, 2023

PR description

Store trielog as blobdb to improve performance. #5475 has shown that blobdb is better choice for static data as it reduced the rocksdb write amplifications during compaction. Manually tested upgrade and downgrade.

Graph showing bytes written during compaction. The yellow bar represents the node with "trielog as blobdb" PR applied (blobdb-2), the green bar represents the node without the blobdb patch (blobdb-3). On average, bytes written during compaction is lower for "trielog as blobdb vs trielog w/o blobdb":

image

Following graph shows fork choice update time (24 hours). There doesn't seem to be any negative impact of writing trielog as blobdb.

image

Fixed Issue(s)

See #5866

Copy link

github-actions bot commented Dec 13, 2023

  • I thought about documentation and added the doc-change-required label to this PR if updates are required.
  • I thought about the changelog and included a changelog update if required.
  • If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

@usmansaleem usmansaleem marked this pull request as ready for review January 16, 2024 22:04
Signed-off-by: Usman Saleem <[email protected]>
@usmansaleem usmansaleem added TeamGroot GH issues worked on by Groot Team and removed TeamGroot GH issues worked on by Groot Team labels Jan 16, 2024
@siladu
Copy link
Contributor

siladu commented Jan 19, 2024

If my PR includes database changes (e.g. KeyValueSegmentIdentifier) I have thought about compatibility and performed forwards and backwards compatibility tests

How was this part tested?

@siladu
Copy link
Contributor

siladu commented Jan 19, 2024

@usmansaleem I'm not really sure what the graph in the description is showing, can you add some explanation please?

Also, have a feeling we discussed it last year, but can't remember what the outcome of the following test was?

Verify if pruning has an impact on performance due to compaction #5866

@siladu
Copy link
Contributor

siladu commented Jan 19, 2024

@usmansaleem I'm not really sure what the graph in the description is showing, can you add some explanation please?

ah, just spotted #5866 (comment)

Have you got a graph for the trie log pruning enabled?


Last one...

Should we have a feature flag for this or should it be enabled by default?

I guess the answer to this was enabled by default because it's possible to downgrade to an older version of the db?

@jframe
Copy link
Contributor

jframe commented Jan 19, 2024

Store trielog as blobdb to improve performance. #5475 has shown that blobdb is better choice for static data as it reduced the rocksdb write amplifications during compaction. Manually tested upgrade and downgrade.

Do you have a graph on the impact on the engine API FCU and new payload?

Copy link
Contributor

@fab-10 fab-10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since after #6445 there can be deletions to this column family, there is the need to enable, and tune, blob garbage collection for it, since now it is it disabled by default (blockchain data cf is meant to be append only with no deletions*)

  • there is an experimental feature for pruning the blockchain data as well, so enabling blob garbage collections is useful also there

@usmansaleem usmansaleem requested a review from fab-10 January 29, 2024 01:21
Signed-off-by: Usman Saleem <[email protected]>
Signed-off-by: Usman Saleem <[email protected]>
Signed-off-by: Usman Saleem <[email protected]>
…nterface has introduced new method

Signed-off-by: Usman Saleem <[email protected]>
Copy link
Contributor

@fab-10 fab-10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
It maybe be required to tune the following two options if the trielog purge is enabled and the space is not reclaimed quicky enough (cc @siladu )

  • blob_garbage_collection_age_cutoff: the cutoff that the GC logic uses to determine which blob files should be considered “old.” For example, the default value of 0.25 signals to RocksDB that blobs residing in the oldest 25% of blob files should be relocated by GC. This parameter can be tuned to adjust the trade-off between write amplification and space amplification.

  • blob_garbage_collection_force_threshold: if the ratio of garbage in the oldest blob files exceeds this threshold, targeted compactions are scheduled in order to force garbage collecting the blob files in question, assuming they are all eligible based on the value of blob_garbage_collection_age_cutoff above. This can help reduce space amplification in the case of skewed workloads where the affected files would not otherwise be picked up for compaction. This option is currently only supported with leveled compactions.

Signed-off-by: Usman Saleem <[email protected]>

# Conflicts:
#	CHANGELOG.md
#	plugin-api/build.gradle
@usmansaleem usmansaleem enabled auto-merge (squash) January 31, 2024 09:46
@usmansaleem usmansaleem merged commit c10cb3d into hyperledger:main Jan 31, 2024
18 checks passed
Gabriel-Trintinalia pushed a commit to Gabriel-Trintinalia/besu that referenced this pull request Feb 1, 2024
Store trielog as blobdb to improve performance.

* Store trie_log_storage as blob
* changelog
* Add gc flag for static data
* Updating plugin-api build.gradle with new hash as SegmentIdentifier interface has introduced new method

Signed-off-by: Usman Saleem <[email protected]>
Signed-off-by: Gabriel-Trintinalia <[email protected]>
@usmansaleem usmansaleem deleted the trielog_blobdb_fixed branch February 1, 2024 23:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants