Make Box2d::contains and Box2d::intersects faster #526
Merged
After stumbling upon https://www.romainguy.dev/posts/2024/down-a-rabbit-hole/, which applies the same optimization to equivalent Kotlin code, I gave it a try in euclid, and it does indeed make for quite a nice speedup:
Chaining simple `&&` conditions produces branch instructions in some cases. Replacing the logical `&&` with bitwise `&` to ensure the compiler does not produce branches makes a big (70~80%) performance difference.

Generated assembly
intersects before:
intersects after:
contains before:
contains after:

Benchmark code
Results in code comments.
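
For readers without the diff at hand, the `&&` to `&` change described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Point` and `Aabb` types, the method names, and the half-open containment convention (inclusive `min`, exclusive `max`) are all assumptions made for the example and are not euclid's actual definitions.

```rust
// Hypothetical stand-in types for illustration only; not euclid's API.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Point {
    pub x: f32,
    pub y: f32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Aabb {
    pub min: Point,
    pub max: Point,
}

impl Aabb {
    /// Short-circuiting form: `&&` stops at the first false comparison,
    /// which the compiler may lower to a chain of conditional branches.
    pub fn contains_branchy(&self, p: Point) -> bool {
        self.min.x <= p.x && p.x < self.max.x && self.min.y <= p.y && p.y < self.max.y
    }

    /// Non-short-circuiting form: `&` on `bool` evaluates all four
    /// comparisons and combines them, so no early exit is required.
    /// Assumed convention: `min` is inclusive, `max` is exclusive.
    pub fn contains_branchless(&self, p: Point) -> bool {
        (self.min.x <= p.x) & (p.x < self.max.x) & (self.min.y <= p.y) & (p.y < self.max.y)
    }

    /// Short-circuiting overlap test (boxes that only touch do not overlap).
    pub fn intersects_branchy(&self, other: &Aabb) -> bool {
        self.min.x < other.max.x
            && self.max.x > other.min.x
            && self.min.y < other.max.y
            && self.max.y > other.min.y
    }

    /// Same overlap test with `&`; the result is identical because the
    /// comparisons have no side effects.
    pub fn intersects_branchless(&self, other: &Aabb) -> bool {
        (self.min.x < other.max.x)
            & (self.max.x > other.min.x)
            & (self.min.y < other.max.y)
            & (self.max.y > other.min.y)
    }
}
```

Both forms return the same result for every input, since the comparisons are pure; only the generated code may differ. Whether the branchless form is actually faster depends on the compiler, target, and how predictable the inputs are, so it is worth benchmarking on the workload in question.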