About the GPU memory when training PointNet++ #1

Open · flandrewries opened this issue Aug 4, 2022 · 2 comments
@flandrewries

Hello,
I'm trying to train the feature-extraction block with the code you released, but I find that with an input of shape (4, 3, 10000), the PointNetSetAbstraction layer takes too much GPU memory (over 12 GB on a 3080 Ti), since the dist_matrix becomes (4, 10000, 10000) when npoints is set to 10000.
I'm wondering if you have any good ideas for solving this problem.
Thanks very much!
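For context, here is a minimal sketch (an assumed, common pure-PyTorch formulation, not necessarily this repo's exact code) of the pairwise-distance step in a set-abstraction layer; the (B, N, N) output is exactly the dist_matrix that blows up:

```python
import torch

def square_distance(src, dst):
    # ||s - d||^2 = ||s||^2 - 2 s.d + ||d||^2, computed for all pairs at once,
    # which materializes the full (B, N, M) matrix.
    dist = -2 * torch.matmul(src, dst.transpose(1, 2))   # (B, N, M)
    dist += torch.sum(src ** 2, dim=-1, keepdim=True)    # + ||s||^2, (B, N, 1)
    dist += torch.sum(dst ** 2, dim=-1).unsqueeze(1)     # + ||d||^2, (B, 1, M)
    return dist

xyz = torch.randn(4, 10000, 3, device="cuda")
d = square_distance(xyz, xyz)  # (4, 10000, 10000): 4e8 floats ~= 1.6 GB in fp32
# Autograd keeps several intermediates of this size alive, so >12 GB is easy to hit.
```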

@vivcheng01 (Owner)

Hi,
I think we ran into a similar out-of-memory issue during training. Maybe decrease the 10000 to 5000 and try again.
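For what it's worth, the saving is quadratic in N, so halving the point count quarters the distance-matrix memory. A back-of-the-envelope check (fp32, batch size 4, as in the issue):

```python
def dist_matrix_gb(B, N, bytes_per_el=4):
    # Size of one (B, N, N) fp32 distance matrix, in GB.
    return B * N * N * bytes_per_el / 1e9

print(dist_matrix_gb(4, 10000))  # ~1.6 GB per matrix
print(dist_matrix_gb(4, 5000))   # ~0.4 GB per matrix
```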

@flandrewries (Author)

I have found a solution to this problem.
The main issue is the inefficiency of the matrix multiplication and the QueryAndGroup operation, so dedicated CUDA ops help here. This version of PointNet++ solves it:
git clone https://github.com/sshaoshuai/Pointnet2.PyTorch.git
If you use CUDA 11, a pull-request version of it can also be used. After replacing the code, a tensor of shape [16, 10000, 10000] can be trained on a 12 GB device.
Thanks a lot!
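For anyone landing here: below is a hedged sketch of how that repo's CUDA ops avoid the dense distance matrix. The module path and signatures are taken from pointnet2_utils in sshaoshuai/Pointnet2.PyTorch as best I recall; verify them against your checkout before relying on this.

```python
import torch
from pointnet2 import pointnet2_utils  # available after building the repo's CUDA extension

B, N, C = 4, 10000, 64
xyz = torch.randn(B, N, 3, device="cuda").contiguous()       # point coordinates
features = torch.randn(B, C, N, device="cuda").contiguous()  # per-point features

npoint, radius, nsample = 1024, 0.1, 32
# Farthest point sampling runs in a custom kernel and returns (B, npoint) indices.
idx = pointnet2_utils.furthest_point_sample(xyz, npoint)
new_xyz = pointnet2_utils.gather_operation(
    xyz.transpose(1, 2).contiguous(), idx
).transpose(1, 2).contiguous()                               # (B, npoint, 3)

# QueryAndGroup does the radius search and grouping in one CUDA op, so peak
# memory is O(B * npoint * nsample) instead of O(B * N * N).
grouper = pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=True)
grouped = grouper(xyz, new_xyz, features)                    # (B, 3 + C, npoint, nsample)
```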
