use predict_generator to better utilize GPU #32

Open
bertsky opened this issue Jun 10, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@bertsky
Contributor

bertsky commented Jun 10, 2021

When the model is applied in patch mode (the default), the patch windows are cropped in a loop (on CPU / in Numpy), and each one is passed to model.predict() as a single image (on GPU / in Keras).

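# each patch is reshaped into a batch containing a single image and predicted on its own: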
label_p_pred = model.predict(img_patch.reshape(1, img_patch.shape[0], img_patch.shape[1], img_patch.shape[2]))

This under-utilizes the GPU for two reasons:

  1. the effective batch size of 1 is likely too small for the number of shader cores and the size of GPU RAM
  2. the GPU kernel only runs briefly and has to wait for the CPU before every patch (patch cropping and memory paging)

I suggest changing the following:

  • Define a generator function doing the patching/cropping. It should be a thread-safe formulation, e.g. a keras.utils.Sequence.
  • Pass that to predict_generator instead of predict to get concurrent CPU / GPU computation.
  • Allow parameterizing the number of workers and the batch size, so the pipeline can be adapted to the concrete hardware and the crop/model sizes (see the sketch below).
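
A minimal sketch of what such a generator could look like, assuming a keras.utils.Sequence subclass. The names PatchSequence, img and model, as well as all parameter values, are illustrative assumptions rather than identifiers from this repository, and border/overlap handling is omitted:

import math
import numpy as np
from tensorflow import keras

class PatchSequence(keras.utils.Sequence):
    """Crops fixed-size patches from one large image, one batch at a time."""

    def __init__(self, img, patch_size, stride, batch_size):
        self.img = img                  # H x W x C numpy array (assumed given)
        self.patch_size = patch_size
        self.batch_size = batch_size
        # top-left corners of all patches (border handling omitted for brevity)
        ys = range(0, img.shape[0] - patch_size + 1, stride)
        xs = range(0, img.shape[1] - patch_size + 1, stride)
        self.coords = [(y, x) for y in ys for x in xs]

    def __len__(self):
        # number of batches
        return math.ceil(len(self.coords) / self.batch_size)

    def __getitem__(self, idx):
        # crop and stack one batch of patches (runs on CPU, possibly in a worker thread)
        batch = self.coords[idx * self.batch_size:(idx + 1) * self.batch_size]
        return np.stack([self.img[y:y + self.patch_size, x:x + self.patch_size]
                         for (y, x) in batch])

seq = PatchSequence(img, patch_size=448, stride=448, batch_size=8)
# tf.keras 2.x: predict() accepts a Sequence directly and can prefetch with workers
label_p_pred = model.predict(seq, workers=4, use_multiprocessing=False)
# older Keras API: label_p_pred = model.predict_generator(seq, workers=4)

Because a Sequence is indexable and thread-safe, Keras can crop the next batches in worker threads while the GPU is busy predicting the current one; the predictions come back in patch order, so they can be stitched back into the full-resolution output. Note that in current tf.keras, model.predict() accepts such a Sequence directly, predict_generator() being the deprecated older name for the same behaviour.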
@bertsky
Contributor Author

bertsky commented Jun 20, 2021

Spoiler: I know how to do this. Would you care for a PR?

@vahidrezanezhad
Member

> Spoiler: I know how to do this. Would you care for a PR?

@bertsky I'd appreciate it if you did that :)

@cneud added the enhancement (New feature or request) label on Apr 19, 2022
@apacha
Contributor

apacha commented Aug 24, 2022

@bertsky did you ever complete this improvement? Maybe on a fork? I would like to run this binarization on a large dataset, and with the current procedure it is simply too slow (10-20 images per minute).
