-
Notifications
You must be signed in to change notification settings - Fork 879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve debug_traceBlock calls performance #8076
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a few suggestions in comments. I'll leave unapproved to make sure someone more experienced in the area gets to review too.
} | ||
return Optional.of(tracesList); | ||
})) | ||
.orElse(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to return an empty collection rather than null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would change the rpc call output from
{"jsonrpc":"2.0","id":1,"result":null}
to
{"jsonrpc":"2.0","id":1,"result":[]}
It was the original behaviour before this PR and would like to keep it as it is.
block.getHash(), | ||
traceableState -> { | ||
Collection<DebugTraceTransactionResult> tracesList = | ||
new CopyOnWriteArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we're using CopyOnWriteArrayList
? It seems wasteful to be copying the entire array every time we add an element, rather than adding a more substantial array size less often, as ArrayList
does.
If it's just because CopyOnWriteArrayList
is threadsafe, why not use Collections.synchronizedList to produce a synchronized ArrayList
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right, Collections.synchronizedList is a better choice here as CopyOnWriteArrayList is not suitable for heavy write loads, which is the case here. I tend to not use Collections.synchronized* but in this case it is a better choice, even if CopyOnWriteArrayList doesn't lock on reads. Collections.synchronized* synchronize (lock) on each method call.
@@ -153,4 +148,39 @@ public int hashCode() { | |||
result = 31 * result + Arrays.hashCode(stack); | |||
return result; | |||
} | |||
|
|||
public static String toCompactHex(final Bytes abytes, final boolean prefix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we can't use Bytes.toHexString
or Bytes.toFastHex
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is explained in the description. We were using toShortHexString
and it was one of the biggest bottelnecks. As we're currently migrating tuweni, I decided to create this method directly in the StructLog java class. It should be replaced in the future Bytes.toFastHex
implementation.
Change the implementation of toShortHexString, by creating a new method toCompactHex. The main improvement is to not rely on Bytes, but on byte[] to avoid dynamic dispatch mechanism related to the use of methods get(i) and size(). Those methods are not used anymore thanks to the switch to StringBuilder, even if it has its own cons. This method will replace in the future toFastHex implementation, once we have finished the work on Tuweni.
.getAndMapWorldState(any(), any()); | ||
when(blockchainQueries.getBlockHeaderByHash(any(Hash.class))) | ||
.thenReturn(Optional.of(blockHeader)); | ||
MockitoAnnotations.openMocks(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to just add @ExtendWith(MockitoExtension.class)
annotation to the class (and optionally @MockitoSettings(strictness = Strictness.LENIENT)
if desired
|
||
@BeforeEach | ||
public void setUp() { | ||
MockitoAnnotations.openMocks(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too
|
||
@BeforeEach | ||
public void setUp() { | ||
MockitoAnnotations.openMocks(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too
|
||
@BeforeEach | ||
public void setUp() { | ||
MockitoAnnotations.openMocks(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an initial pass. Would like to dive deeper into the pipeline changes but feel free to merge if someone else approves
this.outputCounter = | ||
metricsSystem.createLabelledCounter( | ||
BesuMetricCategory.BLOCKCHAIN, | ||
"transactions_debugTraceblock_pipeline_processed_total", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is the casing correct here?
.andFinishWith("collect_results", tracesList::add); | ||
|
||
try { | ||
if (getBlockchainQueries().getEthScheduler().isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a realistic scenario where EthScheduler isn't available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was inherited from Trace_block implementation. I changed both Trace* and DebugTrace* implementations to use the Besu ethScheduler : 6a541e5, created when we start Besu. Now, we don't need anymore to check if it null or not.
EthScheduler ethScheduler = | ||
new EthScheduler(1, 1, 1, 1, new NoOpMetricsSystem()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little mysterious, might be good to name the variable something like "singleThreadedEthScheduler" perhaps? Don't we want metrics too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the implementation and deleted this code 6a541e5
...java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/methods/DebugTraceBlockByNumber.java
Outdated
Show resolved
Hide resolved
.flatMap( | ||
hash -> | ||
block -> | ||
Tracer.processTracing( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't this reuse the same call made in AbstractDebugTraceBlock (as ByHash appears to)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question, this is because DebugTraceBlockByNumber already inherit from AbstractBlockParameterMethod, to handle flags related to block number, like latest, safe or finalized.
.../src/test/java/org/hyperledger/besu/ethereum/api/jsonrpc/internal/results/StructLogTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what's happening with the changes in this file please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes made in DebugOperationTracer are related to not displaying storage and memory in structlog json document if they are empty. This is to match Geth behaviour and reduce Json output size.
Just referencing here https://github.com/Consensys/protocol-misc/issues/1010, to be sure |
To validate the changes with the pipeline, during the testing phase I created a new RPC method |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely some good changes here
new EthScheduler(1, 1, 1, 1, new NoOpMetricsSystem()); | ||
ethScheduler.startPipeline(traceBlockPipeline).get(); | ||
} | ||
ethScheduler.startPipeline(traceBlockPipeline).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much cleaner, very nice
@@ -140,7 +141,8 @@ public void initServerAndClient() throws Exception { | |||
vertx, | |||
mock(ApiConfiguration.class), | |||
Optional.empty(), | |||
mock(TransactionSimulator.class)); | |||
mock(TransactionSimulator.class), | |||
new EthScheduler(1, 1, 1, new NoOpMetricsSystem())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider using DeterministicEthScheduler as done in most other unit tests requiring an EthScheduler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that DeterministicEthScheduler caused a performance issue on some trace calls unit tests on trace_block, trace_filter and trace_replayBlockTransactions. Unit tests were failing because of a timeout. I rollbacked that change on that specific test.
@macfarla @siladu @Gabriel-Trintinalia Would you mind have a look to this PR ? |
9a5330d
to
8cddcfd
Compare
I closed this PR by error and I couldn't reopen it even by adding more commits. |
PR description
This PR introduces several key improvements to the
debug_traceBlock
,debug_traceBlockByNumber
, anddebug_traceBlockByHash
methods, significantly enhancing both performance and memory usage:The core change is the adoption of a pipeline architecture, where:
Memory usage is optimized by the pipeline processing approach. Instead of retaining all trace frames for every transaction, the pipeline now keeps only the trace frames relevant to the current transaction, significantly reducing memory consumption.
The transformation of trace frames into Java JSON representations now occurs in parallel.
Default trace options have been adjusted to align with the behavior of Geth and Nethermind:
The structure of the JSON output has been modified to match Geth's format:
Change the implementation of
toShortHexString
, by creating a new methodtoCompactHex
. The main improvement is to not relay on Bytes, but on byte[] to avoid dynamic dispatch mechanism related to the use of methodsget(i)
andsize().
Those methods are not used anymore thanks to the switch to StringBuilder, even if it has its own cons. This method will replace in the futuretoFastHex
implementation, once we have finished the work on Tuweni.These changes have resulted in a 3X performance improvement, reducing the size of the output by 2 to 3 times, making previously unresponsive calls fast enough with comparable performance with other clients. The optimizations not only improve response times but also allow for smoother operation with lower memory usage, even when processing large traces.
Overall, the new approach ensures better performance, reduced resource usage, and more consistent output with other Ethereum clients, particularly Geth and Nethermind.
Fixed Issue(s)
Fixes #5322
Thanks for sending a pull request! Have you done the following?
doc-change-required
label to this PR if updates are required.Locally, you can run these tests to catch failures early:
./gradlew build
./gradlew acceptanceTest
./gradlew integrationTest
./gradlew ethereum:referenceTests:referenceTests