-
Hi @janaccnt, could you further reduce the size of the model and training/test dataset and see if it works?
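For reference, a rough sketch of what "smaller" could look like, assuming the standard Keras MNIST quickstart rather than the exact notebook code:

```python
# Hypothetical sketch: shrink the Keras MNIST example to a tiny model and dataset.
# Names follow the standard tf.keras MNIST quickstart, not the exact notebook code.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, y_train = x_train[:1000] / 255.0, y_train[:1000]  # keep only 1k training images
x_test, y_test = x_test[:100] / 255.0, y_test[:100]        # keep only 100 test images

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(32, activation="relu"),           # small hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=10)
```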
-
@janaccnt I'm assuming that you're hitting some kind of CUDA OOM issue, since your GPU card only has 3GB. Are you able to run the original TensorFlow example on your laptop (without using Spark)? Also, if it is using the GPU (and failing due to OOMs), you can try decreasing the batch_size to something smaller, e.g. 10, or you can try installing the CPU-only version of TensorFlow to see if the examples work then.
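A minimal sketch of both workarounds, assuming a tf.keras model as in the notebook (adapt the names to the actual code):

```python
# Workaround sketch: hide the GPU from TensorFlow and/or use a smaller batch size.
import os

# Hide the GPU so TensorFlow falls back to CPU (must be set before importing tensorflow).
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))  # should print [] when the GPU is hidden

# ... build `model` as in the notebook, then train with a smaller batch size, e.g.:
# model.fit(x_train, y_train, epochs=1, batch_size=10)
```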
-
I tried several Spark deep learning inference notebooks on Windows. I run Spark in standalone mode with one worker with 12 cores (both driver-memory and executor-memory are set to 8g). I always get the same error when applying the deep learning model to the test dataset. For example, in the MNIST image classification notebook, when running this cell:
I always get this error:
I created and configured the SparkSession as below:
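(The original configuration cell isn't reproduced in this text; the sketch below is only an illustration of a standalone-mode SparkSession matching the settings described above, with a hypothetical master URL and app name.)

```python
# Illustrative sketch only -- not the exact cell from the notebook.
# Assumes a standalone master at spark://localhost:7077 and the settings
# described above: 8g driver and executor memory, 12 executor cores.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("mnist-inference")            # hypothetical app name
    .master("spark://localhost:7077")      # standalone master URL (assumption)
    .config("spark.driver.memory", "8g")
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "12")
    .getOrCreate()
)
```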
This is the web UI of the master:
This is the web UI of the worker:
My Windows laptop has 24 GB of RAM. I've also reduced the model size (just 100k parameters) and the test dataset (only 70 images). I've tried increasing executor.memory to 12g but still got the same error.
The code works perfectly in Google Colab (Spark running in local mode with 1 core). I've tried running Spark in both local and standalone mode on Windows. I've also tried changing the number of cores to 1 but still got the same error.
So I guess the error might be related to the Windows OS I'm using.