This repository has been archived by the owner on Oct 9, 2023. It is now read-only.
Too much RAM usage by ImageClassificationData #1442
Unanswered
Hravan asked this question in Data / pipelines
Replies: 1 comment · 2 replies
-
Hi @Hravan, thanks for reporting this! That definitely seems like a bug. We have made several improvements in the latest release (out today) that should reduce Flash's memory consumption. Could you try upgrading to the latest version? Hope that helps 😃
-
I'm setting up training on this Kaggle competition dataset: https://www.kaggle.com/competitions/plant-pathology-2021-fgvc8
(I'm using only the samples with single labels to simplify the problem.)
The problem is that ImageClassificationData takes too much RAM and the GPU is underutilized. I wrote the same training in plain PyTorch for comparison to confirm that the problem lies somewhere within ImageClassificationData.
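I can't see inside ImageClassificationData from here, but a common cause of this kind of RAM blow-up is a pipeline that keeps loaded samples (or fetched batches) alive instead of materializing them lazily. A minimal, library-free sketch of the difference, using only the standard library's tracemalloc (EagerDataset and LazyDataset are hypothetical toy classes, not Flash or PyTorch APIs):

```python
import tracemalloc


class EagerDataset:
    """Loads every sample into memory up front (mimics a caching pipeline)."""

    def __init__(self, n, size):
        self.samples = [bytes(size) for _ in range(n)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        return self.samples[i]


class LazyDataset:
    """Materializes each sample only when it is requested."""

    def __init__(self, n, size):
        self.n, self.size = n, size

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        return bytes(self.size)


def peak_during_iteration(dataset_cls, n=200, size=50_000):
    """Return the peak traced allocation while iterating the whole dataset."""
    tracemalloc.start()
    ds = dataset_cls(n, size)
    for i in range(len(ds)):
        _ = ds[i]  # consume one sample at a time, as a training loop would
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


eager_peak = peak_during_iteration(EagerDataset)
lazy_peak = peak_during_iteration(LazyDataset)
# The eager variant holds all n samples at once, so its peak is far higher
# even though both iterate over identical data.
print(eager_peak, lazy_peak)
```

If Flash's data fetcher retains references to batches it has already yielded, the memory profile would look like the eager case even though the underlying dataset loads images lazily.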
Code shared by both training versions:
Training in plain PyTorch:
Training in Lightning Flash:
When I increase batch_size to 64 or num_workers to 16 in ImageClassificationData, I start running out of RAM, which does not happen with the plain PyTorch version. Any ideas what the problem might be? I tried profiling but didn't reach any sensible conclusion, except that I suspect the problem is in BaseDataFetcher in DataModule.
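Part of the growth with batch_size and num_workers is expected: in a PyTorch-style DataLoader each worker keeps up to prefetch_factor batches in flight (the default prefetch_factor is 2), so prefetch memory scales with the product of all three. A rough back-of-the-envelope sketch (the image dimensions and estimated_prefetch_ram helper are illustrative assumptions, not measurements from this dataset):

```python
def estimated_prefetch_ram(batch_size, num_workers, prefetch_factor, bytes_per_sample):
    """Rough upper bound on RAM held by a DataLoader-style prefetch queue:
    each worker may hold prefetch_factor full batches at once."""
    return batch_size * num_workers * prefetch_factor * bytes_per_sample


# Assumption: decoded 600x600 RGB float32 tensors, ~4.3 MB per sample.
per_sample = 600 * 600 * 3 * 4

# batch_size=64, num_workers=16, PyTorch's default prefetch_factor=2
bound = estimated_prefetch_ram(64, 16, 2, per_sample)
print(bound / 2**30)  # ~8.2 GiB held by prefetched batches alone
```

That bound applies equally to plain PyTorch, though, so if Flash uses noticeably more RAM at the same settings, something beyond prefetching (e.g. the data fetcher holding extra references) is likely adding to it.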