
extract_mesh.py - CUDA memory allocation exceeding 100+ GiB #96

Open
andreasHovaldt opened this issue Nov 28, 2024 · 2 comments

andreasHovaldt commented Nov 28, 2024

Hello,
Firstly thank you for this amazing contribution :)

I have been trying to get a mesh extraction pipeline up and running, so currently I have only tried GOF on the Mip-NeRF 360 scenes. The training process works perfectly, but when I run "extract_mesh.py" on the scenes, I run into incredibly high memory allocation requirements.
Any help with this problem would be highly appreciated!

Command

OMP_NUM_THREADS=4 CUDA_VISIBLE_DEVICES=0 python extract_mesh.py -m exp_360/release/garden --iteration 30000

Output

Looking for config file in exp_360/release/garden/cfg_args
Config file found: exp_360/release/garden/cfg_args
Rendering exp_360/release/garden
Loading trained model at iteration 30000
Reading camera 185/185
Loading Training Cameras
Traceback (most recent call last):
  File "/home/andreas/Github/gaussian-opacity-fields/extract_mesh.py", line 163, in <module>
    extract_mesh(model.extract(args), args.iteration, pipeline.extract(args), args.filter_mesh, args.texture_mesh, args.near, args.far)
  File "/home/andreas/Github/gaussian-opacity-fields/extract_mesh.py", line 141, in extract_mesh
    marching_tetrahedra_with_binary_search(dataset.model_path, "test", iteration, cams, gaussians, pipeline, background, kernel_size, filter_mesh, texture_mesh, near, far)
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/andreas/Github/gaussian-opacity-fields/extract_mesh.py", line 43, in marching_tetrahedra_with_binary_search
    points, points_scale = gaussians.get_tetra_points(views, near, far)
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/andreas/Github/gaussian-opacity-fields/scene/gaussian_model.py", line 462, in get_tetra_points
    vertex_mask = get_frustum_mask(vertices, views, near, far)
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/andreas/Github/gaussian-opacity-fields/scene/gaussian_model.py", line 56, in get_frustum_mask
    view_points = einsum(view_matrices, homo_points, "n_view b c, N c -> n_view N b")
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/einops/einops.py", line 907, in einsum
    return get_backend(tensors[0]).einsum(pattern, *tensors)
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/einops/_backends.py", line 287, in einsum
    return self.torch.einsum(pattern, *x)
  File "/home/andreas/miniforge3/envs/gof2/lib/python3.9/site-packages/torch/functional.py", line 402, in einsum
    return _VF.einsum(equation, operands)  # type: ignore[attr-defined]
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 139.69 GiB. GPU 0 has a total capacity of 23.65 GiB of which 16.11 GiB is free. Including non-PyTorch memory, this process has 7.52 GiB memory in use. Of the allocated memory 6.09 GiB is allocated by PyTorch, and 971.39 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

This example was taken from running python scripts/run_mipnerf360.py, but I have also tried running the train.py and extract_mesh.py scripts manually, which ends with a similar error message.

Edit: I have now tried a handful of the different scenes in the Mip-NeRF 360 dataset, and the crashes occur with requested allocations ranging from 40 GiB to 140 GiB.
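For reference, the reported allocation is consistent with the einsum in the traceback materializing one big tensor for all views at once: the pattern "n_view b c, N c -> n_view N b" produces an (n_view, N, 4) result. A rough sanity check (assuming float32 and the 185 cameras shown in the log; the vertex count is inferred, not taken from the repo):

```python
# Back-of-the-envelope size of the einsum output (n_view, N, 4), float32.
n_views = 185
bytes_per_elem = 4           # float32
alloc_gib = 139.69           # from the OOM message

# Solve alloc = n_views * N * 4 * bytes_per_elem for N.
n_points = alloc_gib * 1024**3 / (n_views * 4 * bytes_per_elem)
print(f"~{n_points / 1e6:.0f} million tetra vertices")  # roughly 5e7
```

So a scene with on the order of 50 million candidate vertices would ask for ~140 GiB when all 185 view transforms are batched into a single tensor.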

Runtime info

Graphics Card: RTX 4090
Operating System: Ubuntu 24.04.1 LTS x86_64
nvcc version: Build cuda_12.4.r12.4/compiler.34097967_0
C++ compiler: 9.5.0

@niujinshuchong
Member

Hi, thanks for reporting this. I merged the code without testing it on larger scenes. You can try using a previous commit (a3ca467), or update this part with a for loop: https://github.com/autonomousvision/gaussian-opacity-fields/blob/main/scene/gaussian_model.py#L54-L71.
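The for-loop fix could look like the sketch below: instead of one einsum that builds an (n_view, N, 4) tensor for every view at once, transform the points one view at a time and accumulate a boolean mask. The function name, signature, and the depth-only frustum test here are assumptions based on the traceback, not the repo's exact API; the real get_frustum_mask likely also clips against the image bounds, and that condition should be carried over from the original code.

```python
import torch

@torch.no_grad()
def get_frustum_mask_looped(vertices, view_matrices, near, far):
    """Memory-friendly sketch: peak extra memory is O(N) per view
    instead of O(n_view * N) for the batched einsum."""
    n = vertices.shape[0]
    # Homogeneous coordinates: (N, 4)
    homo_points = torch.cat(
        [vertices, torch.ones(n, 1, device=vertices.device)], dim=-1)
    mask = torch.zeros(n, dtype=torch.bool, device=vertices.device)
    for view_matrix in view_matrices:          # one (4, 4) matrix per camera
        view_points = homo_points @ view_matrix.T   # (N, 4), single view only
        z = view_points[:, 2]
        mask |= (z > near) & (z < far)         # keep points visible in any view
    return mask
```

Chunking over a few views at a time (rather than strictly one) would trade a little memory back for speed, but the single-view loop is the simplest drop-in shape.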

@andreasHovaldt
Author

Thank you for the response, I switched to the mentioned commit which fixed the problem :)
