-
The cuspatial PIP test should not take 45 s to test 1B points against a single polygon with 1000 vertices. Python overhead and CPU->GPU transfer are likely the dominating factors.
-
Hi @epifanio! Appreciate you taking the time to do this! A few things come to my mind here:
To @zhangjianting's point, I'm curious whether consecutive runs of the cuSpatial PIP lower the time (that should help show kernel launch overhead). In general, right now we're focused on increasing the reach of cuSpatial, and then we can implement further accelerations; regardless, this is valuable and we really appreciate the feedback. I hope you find the time to test using the quadtree. Thanks!
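One way to check the consecutive-run question is to time the same call several times and compare the first (cold) call with the later (warm) ones. The sketch below is only an illustration: `time_consecutive_runs`, `summarize`, and `dummy_workload` are hypothetical names, and the workload is a pure-Python stand-in, not the cuSpatial call from the original post.

```python
import time


def time_consecutive_runs(fn, *args, repeats=5):
    """Call fn(*args) `repeats` times and return the wall-clock time of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Separate the first (cold) call from the fastest later (warm) call."""
    first = timings[0]
    steady = min(timings[1:]) if len(timings) > 1 else first
    return {"first_call_s": first, "steady_state_s": steady}


# Hypothetical stand-in workload; swap in the PIP call being benchmarked.
def dummy_workload(n):
    return sum(i * i for i in range(n))


timings = time_consecutive_runs(dummy_workload, 100_000, repeats=5)
summary = summarize(timings)
print(summary)
```

If the first call is much slower than the rest, one-time costs (JIT compilation, context creation, initial transfers) are likely inflating a single-run benchmark. For GPU libraries that launch work asynchronously, the timed callable should block until the result is ready, for example by copying it back to the host; otherwise the timer mostly measures launch time.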
-
Thanks for looking at
-
Sounds like @jarmak-nv is right about the architectural limitations, @epifanio. Here are my modifications to bring it up to the latest API spec. I only iterate twice, but on my 3.4 GHz Xeon with 6 cores and 12 threads, running a GV100 GPU (32 GB), cuspatial was 10x faster than your awesome numba kernel:
[0.43572211265563965, 1.0868034362792969, 2.0259852409362793]
[0.722322940826416, 9.562219381332397, 20.059306144714355]
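To make the comparison explicit, the per-entry ratios can be computed directly from the two lists. This is a sketch under one loud assumption: the post does not label the lists, so the first is taken to be cuspatial and the second the numba kernel. The variable names are hypothetical.

```python
# Timings in seconds, copied verbatim from the two lists in the post.
# ASSUMPTION: first list = cuspatial, second list = numba kernel (unlabeled in the post).
cuspatial_s = [0.43572211265563965, 1.0868034362792969, 2.0259852409362793]
numba_s = [0.722322940826416, 9.562219381332397, 20.059306144714355]

# Element-wise ratio: how many times slower the numba timing is for each entry.
speedups = [n / c for n, c in zip(numba_s, cuspatial_s)]

for c, n, s in zip(cuspatial_s, numba_s, speedups):
    print(f"cuspatial {c:7.3f} s  numba {n:7.3f} s  speedup {s:4.1f}x")
```

Under that assumption the ratio is below 2x for the first entry and approaches 10x for the last, which is consistent with the "10x" figure quoted.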
-
I'm looking more closely at your code here:
This uses the older API. I believe it should read:
-
@epifanio I found the SO answer you mentioned but didn't link.
-
Hi,
I am working on a point-in-polygon problem. I came across the cuspatial point-in-polygon implementation and compared it with other solutions found on the web. I am reporting my findings in an SO answer, but I thought it was also worth mentioning here.
To compare the methods, I coded the CPU algorithms using numba and reproduced some results with synthetic data. Code follows:
is_inside_sm, the fastest method on the CPU:
Comparison:
which led to the following results:
Hardware specs:
Is the is_inside_sm method portable to the GPU? Would cuspatial benefit from it?
I understand that the cuspatial method offers much more flexibility and can perhaps handle complex polygonal shapes, but for a simple scenario is_inside_sm seems more convenient, as it doesn't require a complex environment or dedicated hardware. I wonder how it would compare with the quadtree-based method in cuspatial. Would someone like to extend this example with synthetic data to include it?
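For readers without the linked code, the general idea behind a simple CPU point-in-polygon test can be sketched with the classic even-odd ray-casting rule. This is a generic textbook version in plain Python, not the is_inside_sm function from the post and not cuspatial's implementation; `point_in_polygon` and the sample data are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: cast a horizontal ray to the right of (x, y)
    and toggle `inside` each time it crosses a polygon edge."""
    inside = False
    n = len(polygon)
    j = n - 1  # index of the previous vertex, so (j, i) is the current edge
    for i in range(n):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does this edge straddle the horizontal line through y?
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


# Hypothetical test data: a unit square and three query points.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
points = [(0.5, 0.5), (1.5, 0.5), (-0.1, 0.2)]
results = [point_in_polygon(px, py, square) for px, py in points]
print(results)
```

Each point is tested independently of the others, which is why this kind of loop parallelizes naturally across points, whether with numba on the CPU or one thread per point on a GPU. Points lying exactly on an edge are not handled specially in this sketch.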