This repository has been archived by the owner on Jul 30, 2024. It is now read-only.
As I commented in that thread, I can only get up to 8.5 it/s (without his caching strategy); I didn't test further. He mentioned caching gave him +4 it/s, so maybe that is the trick? In my implementation, though, data loading is not a bottleneck at all (it takes about 2e-4 s to fetch a batch), so I didn't try it.
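For reference, a minimal sketch (not code from this repo) of how one might measure the per-batch fetch time quoted above. `time_batch_fetch` and the dummy loader are illustrative names; any iterable of batches works:

```python
import time

def time_batch_fetch(loader, n_batches=100):
    """Average wall-clock seconds to fetch one batch from an iterable loader.

    With a torch DataLoader, pass iter(loader); if fetch time is tiny
    (~2e-4 s), data loading is unlikely to be the training bottleneck.
    """
    it = iter(loader)
    start = time.perf_counter()
    for _ in range(n_batches):
        next(it)
    return (time.perf_counter() - start) / n_batches

# Dummy stand-in for a data loader: an endless stream of ray batches.
dummy_loader = iter(lambda: list(range(1024)), None)
avg = time_batch_fetch(dummy_loader, n_batches=100)
```

If `avg` comes out orders of magnitude below the per-iteration train time, caching the dataset would mostly save CPU work, not wall-clock time.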
Missed out on this issue. I haven't been able to follow up on that thread, as I haven't run additional experiments since. As for the setup, it was PyTorch 1.4, CUDA 10.1, Python 3.6, running on a GPU cluster — specifically, on a node with a V100 GPU.
As for the config, I took care to replicate the exact lego config file (network of 8 layers, 128 fine samples per ray, and so on). For the speed comparison, I measure the time taken up to the optimizer parameter update, not the times reported by the tqdm loop, which include TensorBoard logging, etc.
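The measurement above can be sketched as follows — a hedged, stdlib-only example (`StepTimer` is a hypothetical helper, not from the repo) that accumulates time only inside the train step, so logging and progress-bar updates don't inflate the it/s figure. On GPU you would also call `torch.cuda.synchronize()` before leaving the timed block, since CUDA kernels launch asynchronously:

```python
import time

class StepTimer:
    """Context manager that accumulates time spent inside the train step
    (forward + backward + optimizer.step()), excluding logging."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def __enter__(self):
        self._t0 = time.perf_counter()
        return self

    def __exit__(self, *exc):
        self.total += time.perf_counter() - self._t0
        self.count += 1

    @property
    def it_per_s(self):
        return self.count / self.total if self.total else 0.0

timer = StepTimer()
for _ in range(5):
    with timer:
        # Dummy work standing in for forward/backward/optimizer.step().
        sum(i * i for i in range(10000))
    # TensorBoard logging, tqdm updates, etc. go here, outside the timer.
print(f"{timer.it_per_s:.1f} it/s")
```

Comparing this number against the tqdm-reported rate shows how much overhead the logging adds.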
May I know the exact config (hyperparameters) used to get the average speed of 14.2 it/s (on an RTX 2080 Ti, PyTorch 1.4.0, CUDA 9.2) reported here?
I couldn't get it by simply following modifications in #6.
(cc @kwea123, did you test it further?)
Thanks in advance!