About the GPU memory when training PointNet++ #1

Open · flandrewries opened this issue Aug 4, 2022 · 2 comments
@flandrewries

Hello,
I'm trying to train the feature-extraction block with the code you released, but I find that with an input of shape (4, 3, 10000), the PointNetSetAbstraction layer takes too much GPU memory (over 12 GB on a 3080 Ti), since the dist_matrix becomes (4, 10000, 10000) when npoints is set to 10000.
I'm wondering if you have any good ideas for solving this problem.
Thanks very much!
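For context, here is a minimal sketch (an assumed, common pure-PyTorch formulation, not necessarily this repo's exact code) of the pairwise-distance step in a set-abstraction layer; the (B, N, N) output is exactly the dist_matrix that blows up:

```python
import torch

def square_distance(src, dst):
    # ||s - d||^2 = ||s||^2 - 2 s.d + ||d||^2, computed for all pairs at once,
    # which materializes the full (B, N, M) matrix.
    dist = -2 * torch.matmul(src, dst.transpose(1, 2))   # (B, N, M)
    dist += torch.sum(src ** 2, dim=-1, keepdim=True)    # + ||s||^2, (B, N, 1)
    dist += torch.sum(dst ** 2, dim=-1).unsqueeze(1)     # + ||d||^2, (B, 1, M)
    return dist

xyz = torch.randn(4, 10000, 3, device="cuda")
d = square_distance(xyz, xyz)  # (4, 10000, 10000): 4e8 floats ~= 1.6 GB in fp32
# Autograd keeps several intermediates of this size alive, so >12 GB is easy to hit.
```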

@vivcheng01 (Owner)

Hi,
I think we ran into a similar out-of-memory issue during training. Maybe decrease the 10000 to 5000 and try again.
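For what it's worth, the saving is quadratic in N, so halving the point count quarters the distance-matrix memory. A back-of-the-envelope check (fp32, batch size 4, as in the issue):

```python
def dist_matrix_gb(B, N, bytes_per_el=4):
    # Size of one (B, N, N) fp32 distance matrix, in GB.
    return B * N * N * bytes_per_el / 1e9

print(dist_matrix_gb(4, 10000))  # ~1.6 GB per matrix
print(dist_matrix_gb(4, 5000))   # ~0.4 GB per matrix
```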

@flandrewries (Author)

I have found a solution to this problem.
The main issue is the inefficiency of the matrix multiplication and the QueryAndGroup operation, so dedicated CUDA ops help here. This version of PointNet++ solves it:
git clone https://github.com/sshaoshuai/Pointnet2.PyTorch.git
If you use CUDA 11, a pull-request version of it can also be used. After replacing the code, a tensor of shape [16, 10000, 10000] can be trained on a 12 GB device.
Thanks a lot!
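For anyone landing here: below is a hedged sketch of how that repo's CUDA ops avoid the dense distance matrix. The module path and signatures are taken from pointnet2_utils in sshaoshuai/Pointnet2.PyTorch as best I recall; verify them against your checkout before relying on this.

```python
import torch
from pointnet2 import pointnet2_utils  # available after building the repo's CUDA extension

B, N, C = 4, 10000, 64
xyz = torch.randn(B, N, 3, device="cuda").contiguous()       # point coordinates
features = torch.randn(B, C, N, device="cuda").contiguous()  # per-point features

npoint, radius, nsample = 1024, 0.1, 32
# Farthest point sampling runs in a custom kernel and returns (B, npoint) indices.
idx = pointnet2_utils.furthest_point_sample(xyz, npoint)
new_xyz = pointnet2_utils.gather_operation(
    xyz.transpose(1, 2).contiguous(), idx
).transpose(1, 2).contiguous()                               # (B, npoint, 3)

# QueryAndGroup does the radius search and grouping in one CUDA op, so peak
# memory is O(B * npoint * nsample) instead of O(B * N * N).
grouper = pointnet2_utils.QueryAndGroup(radius, nsample, use_xyz=True)
grouped = grouper(xyz, new_xyz, features)                    # (B, 3 + C, npoint, nsample)
```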
