Unable to load checkpoint for prediction #294

Open
bnubald opened this issue Aug 30, 2024 · 1 comment
bnubald commented Aug 30, 2024

  • IceNet version: v0.3.0_dev
  • Python version: 3.11

Description

Predicting with a model trained under v0.3.0_dev fails with the error below. The prediction step looks for a model checkpoint named *network_* rather than *model_*, which is the name under which training deposited the checkpoint.

E.g.
Actual model location as output by icenet_train_tensorflow:
./results/networks/unet_train_south/unet_train_south.model_unet_pipeline_south.42/
Location TensorFlow tries to load from for prediction:
./results/networks/unet_train_south/unet_train_south.network_unet_pipeline_south.42/
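
A quick way to confirm the mismatch is to list what training actually wrote under the network folder. The following is a minimal sketch, assuming the directory layout shown above; `list_checkpoints` is a hypothetical helper written for this illustration and is not part of IceNet.

```python
from pathlib import Path


def list_checkpoints(network_folder, run_name):
    """Return sorted names of entries in network_folder starting with '<run_name>.'.

    Hypothetical diagnostic helper, not an IceNet function.
    """
    folder = Path(network_folder)
    if not folder.is_dir():
        # Nothing to list if the folder is missing.
        return []
    prefix = run_name + "."
    return sorted(p.name for p in folder.iterdir() if p.name.startswith(prefix))


if __name__ == "__main__":
    # Paths taken from the example in this issue.
    for name in list_checkpoints("./results/networks/unet_train_south",
                                 "unet_train_south"):
        print(name)
```

If only a `.model_` entry is listed and no `.network_` entry, the prediction step is looking under the wrong name.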

What I Did

Variables are based on the icenet-pipeline script.

$ ./run_train_ensemble.sh \
    -b $BATCH_SIZE -e 10 -f $FILTER_FACTOR -p $PREP_SCRIPT -j 8 \
    ${TRAIN_DATA_NAME}_${HEMI} ${TRAIN_DATA_NAME}_${HEMI} ${FORECAST}_train_${HEMI}

$ ./run_prediction.sh fc.${FORECAST} ${FORECAST}_train_${HEMI} $HEMI forecast $TRAIN_DATA_NAME 2>&1 | tee logs/fc.${HEMI}.log

[2024-08-30 15:50:34,424 :INFO    ] - Loading model from ./results/networks/unet_train_south/unet_train_south.network_unet_pipeline_south.42...
Traceback (most recent call last):
  File "/data/hpcdata/users/bryald/miniconda3/envs/icenet_gan_pipeline/bin/icenet_predict", line 33, in <module>
    sys.exit(load_entry_point('icenet', 'console_scripts', 'icenet_predict')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/hpcdata/users/bryald/git/icenet/icenet/icenet/model/predict.py", line 157, in main
    predict_forecast(
  File "/data/hpcdata/users/bryald/git/icenet/icenet/icenet/model/predict.py", line 62, in predict_forecast
    network = load_model(model_path, compile=False)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/hpcdata/users/bryald/miniconda3/envs/icenet_gan_pipeline/lib/python3.11/site-packages/keras/src/saving/saving_api.py", line 262, in load_model
    return legacy_sm_saving_lib.load_model(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/hpcdata/users/bryald/miniconda3/envs/icenet_gan_pipeline/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/data/hpcdata/users/bryald/miniconda3/envs/icenet_gan_pipeline/lib/python3.11/site-packages/keras/src/saving/legacy/save.py", line 234, in load_model
    raise IOError(
OSError: No file or directory found at ./results/networks/unet_train_south/unet_train_south.network_unet_pipeline_south.42

Solution

This is likely a quick fix: the path being searched should probably be model_path rather than network_path.

Current network_path being used here:

network.load_weights(network_path)

Switch to model_path as defined in the training output:

model_path = os.path.join(
    network_folder,
    "{}.model_{}.{}".format(run_name, dataset.identifier, seed))
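
One way the prediction side could pick up the right checkpoint is sketched below. This is only an illustration under stated assumptions: `resolve_checkpoint_path` is a hypothetical helper, not an existing IceNet function, and the fallback to the older `network_` name is an assumption added so that previously trained checkpoints would still load. The issue itself only proposes switching to `model_path`.

```python
import os


def resolve_checkpoint_path(network_folder, run_name, dataset_identifier, seed):
    """Return the first existing checkpoint path.

    Prefers the 'model_' name written by training and falls back to the
    'network_' name (the fallback is an assumption, see the lead-in).
    """
    candidates = [
        os.path.join(
            network_folder,
            "{}.{}_{}.{}".format(run_name, prefix, dataset_identifier, seed))
        for prefix in ("model", "network")
    ]
    for path in candidates:
        if os.path.exists(path):
            return path
    raise FileNotFoundError(
        "No checkpoint found; tried: " + ", ".join(candidates))
```

With the values from this issue, the call would be `resolve_checkpoint_path("./results/networks/unet_train_south", "unet_train_south", "unet_pipeline_south", 42)`, and the result could then be passed to `load_model(..., compile=False)` as in the traceback.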

@bnubald bnubald added the bug Something isn't working label Aug 30, 2024
@bnubald bnubald added this to the v0.3.0 milestone Aug 30, 2024