Performance benchmarking shows that as is, the KDTree construction is the bottleneck of the entire algorithm. A naive approach to parallelization could be to distribute the subtree construction to threads. The very coarse levels do not parallelize well in this algorithm, but a sufficiently deep tree mitigates this disadvantage.
Implementation includes two non-trivial aspects:
Review nanoflann's internal data structures w.r.t. thread safety, as we are operating outside of nanoflann's communicated thread safety guarantee. However, we are only using the static version of nanoflann (the dynamic one would definitely not be thread-safe).
Add a task-based OpenMP parallelization to the recursive construction algorithm.
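The task-based idea can be sketched on a toy tree. The following is a minimal illustration, not nanoflann's code: `Node`, `build`, `build_tree`, `kLeafSize`, and the `threads` and `task_depth` parameters are all hypothetical names, and the median split via `std::nth_element` is an assumption made for brevity. Each recursive call spawns one OpenMP task per child. The two children work on disjoint ranges of the index array, which is what makes the concurrent construction safe in this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <vector>

// Toy static KD-tree over 2D points (hypothetical, NOT nanoflann's structures).
using Point = std::array<double, 2>;

struct Node {
    std::size_t begin = 0, end = 0;  // range [begin, end) in the index array
    int split_dim = -1;              // -1 marks a leaf
    double split_val = 0.0;
    std::unique_ptr<Node> left, right;
};

constexpr std::size_t kLeafSize = 8;  // assumed leaf capacity

// Recursively build the subtree for idx[begin, end).
std::unique_ptr<Node> build(const std::vector<Point>& pts,
                            std::vector<std::size_t>& idx,
                            std::size_t begin, std::size_t end,
                            int depth, int task_depth) {
    auto node = std::make_unique<Node>();
    node->begin = begin;
    node->end = end;
    if (end - begin <= kLeafSize) return node;  // small range: keep as leaf

    const int dim = depth % 2;
    const std::size_t mid = begin + (end - begin) / 2;
    const auto first = idx.begin() + static_cast<std::ptrdiff_t>(begin);
    const auto nth = idx.begin() + static_cast<std::ptrdiff_t>(mid);
    const auto last = idx.begin() + static_cast<std::ptrdiff_t>(end);
    // Partition the indices around the median coordinate in dimension dim.
    std::nth_element(first, nth, last, [&](std::size_t a, std::size_t b) {
        return pts[a][dim] < pts[b][dim];
    });
    node->split_dim = dim;
    node->split_val = pts[idx[mid]][dim];

    Node* raw = node.get();
    // pts and idx must be shared: a firstprivate reference would copy the vector.
    // Below task_depth the if clause is false and the task runs undeferred.
    #pragma omp task shared(pts, idx) firstprivate(raw) if(depth < task_depth)
    raw->left = build(pts, idx, begin, mid, depth + 1, task_depth);
    #pragma omp task shared(pts, idx) firstprivate(raw) if(depth < task_depth)
    raw->right = build(pts, idx, mid, end, depth + 1, task_depth);
    #pragma omp taskwait  // both children must finish before this frame returns
    return node;
}

// Entry point: one thread starts the recursion, the team executes the tasks.
std::unique_ptr<Node> build_tree(const std::vector<Point>& pts,
                                 std::vector<std::size_t>& idx,
                                 int threads, int task_depth) {
    assert(threads > 0);
    idx.resize(pts.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::unique_ptr<Node> root;
    #pragma omp parallel num_threads(threads)
    #pragma omp single
    root = build(pts, idx, 0, pts.size(), 0, task_depth);
    return root;
}

// Sum of leaf sizes; equals the number of input points for a complete tree.
std::size_t count_leaf_points(const Node* n) {
    if (n == nullptr) return 0;
    if (n->split_dim < 0) return n->end - n->begin;
    return count_leaf_points(n->left.get()) + count_leaf_points(n->right.get());
}

// Check that every inner node separates its points at split_val.
bool is_valid(const Node* n, const std::vector<Point>& pts,
              const std::vector<std::size_t>& idx) {
    if (n->split_dim < 0) return true;
    const std::size_t mid = n->left->end;
    for (std::size_t i = n->begin; i < mid; ++i)
        if (pts[idx[i]][n->split_dim] > n->split_val) return false;
    for (std::size_t i = mid; i < n->end; ++i)
        if (pts[idx[i]][n->split_dim] < n->split_val) return false;
    return is_valid(n->left.get(), pts, idx) && is_valid(n->right.get(), pts, idx);
}
```

With GCC or Clang this needs `-fopenmp`; without that flag the pragmas are ignored and the same code builds the tree serially. The `threads` and `task_depth` arguments are one possible way to expose control over the parallelization: a call such as `build_tree(pts, idx, 4, 3)` would use up to four threads and stop creating deferred tasks below depth 3, so that small subtrees do not pay the task overhead.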
However, my initial testing fell far short of the advocated speedup of 3: I measured around 20% for medium core counts, and the code got slower for large core counts. I am currently hesitant to include this in the code base, at least not without a user interface to control it.