Enable queue phase parallelization by using Parallel to back RenderPhase #11984
Objective
Part 2 on the quest to resolve #3548. Many of the queue and prepare systems are unable to run in parallel, creating a potential CPU bottleneck in dense and varied scenes.
Solution
Use `Parallel`, introduced in #7348, to provide a cleaner implementation of #4899's thread-local queues, so systems can add phase items to the queue without needing a mutable reference to the render phase. Because systems with only read-only access can run in parallel under Bevy's system executor, many of these systems become "read-only" in the eyes of the scheduler and no longer block each other from running in parallel.
This potentially adds a bit of overhead on a per-item basis, due to needing to fetch the thread-local queue for each item, though it may be mitigable by providing a `RenderPhase::extend` option that takes an iterator instead of adding items one by one.
As an additional optimization, this PR includes a change to `Parallel::drain_into` that skips copying when the output `Vec` is known to be empty, which avoids the copying overhead when only one system enqueued items before sorting.
IMO, this design is much cleaner than the one in #4899, though it may be a bit rough to grok for people seeing it for the first time.
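For readers new to the pattern, here is a minimal std-only sketch of the idea, not the code in this PR. `PerThreadQueues` and `Phase` are hypothetical stand-ins for `Parallel` and `RenderPhase`, and the `Mutex<HashMap<ThreadId, _>>` is an assumption made only to keep the example free of external crates; a real thread-local store would not take a shared lock. It shows `add` and `extend` taking `&self`, and a `drain_into` that swaps buffers when the output is empty.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread::{self, ThreadId};

/// Hypothetical stand-in for a `Parallel`-style container: one `Vec<T>` per
/// thread. The shared `Mutex` is an assumption that keeps the sketch std-only.
pub struct PerThreadQueues<T> {
    queues: Mutex<HashMap<ThreadId, Vec<T>>>,
}

impl<T> PerThreadQueues<T> {
    pub fn new() -> Self {
        Self { queues: Mutex::new(HashMap::new()) }
    }

    /// Push one item onto the calling thread's queue. Only needs `&self`.
    pub fn push(&self, item: T) {
        let mut queues = self.queues.lock().unwrap();
        queues.entry(thread::current().id()).or_default().push(item);
    }

    /// Push many items with a single queue lookup instead of one per item.
    pub fn extend(&self, items: impl IntoIterator<Item = T>) {
        let mut queues = self.queues.lock().unwrap();
        queues.entry(thread::current().id()).or_default().extend(items);
    }

    /// Move every queued item into `out`, leaving the queues empty.
    pub fn drain_into(&mut self, out: &mut Vec<T>) {
        // `&mut self` proves exclusive access, so no locking is needed here.
        let queues = self.queues.get_mut().unwrap();
        for queue in queues.values_mut() {
            if out.is_empty() {
                // Swap the buffers rather than moving items one by one.
                std::mem::swap(out, queue);
            } else {
                out.append(queue);
            }
        }
    }
}

/// Hypothetical, simplified render phase whose items are plain sort keys.
pub struct Phase {
    queued: PerThreadQueues<u32>,
    items: Vec<u32>,
}

impl Phase {
    pub fn new() -> Self {
        Self { queued: PerThreadQueues::new(), items: Vec::new() }
    }

    /// Queueing needs only `&self`, so callers can share the phase.
    pub fn add(&self, item: u32) {
        self.queued.push(item);
    }

    pub fn extend(&self, items: impl IntoIterator<Item = u32>) {
        self.queued.extend(items);
    }

    /// Collect everything queued so far, then sort. Needs `&mut self`.
    pub fn sort(&mut self) -> &[u32] {
        self.queued.drain_into(&mut self.items);
        self.items.sort_unstable();
        &self.items
    }
}

/// Two "queue systems" share `&Phase` across threads, then one sort runs.
pub fn demo() -> Vec<u32> {
    let mut phase = Phase::new();
    thread::scope(|scope| {
        scope.spawn(|| phase.add(3));
        scope.spawn(|| phase.extend([1, 2]));
    });
    phase.sort().to_vec()
}
```

In this sketch both spawned closures borrow `phase` immutably, which is the property that lets a scheduler treat the queueing systems as read-only; only `sort` needs exclusive access.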
Performance
TODO
Changelog
Changed: `RenderPhase::add` now only requires a
Changed: `RenderPhase`'s fields are now private
Migration Guide
TODO