Thank you for your work on this very useful library!
I have had success training Albert Unbiased from scratch. I'm curious how model performance would compare if training continued from one of your checkpoints (unbiased-albert-c8519128.ckpt in this case). However, if I attempt to launch train.py with this file, I get the following error:

KeyError: 'Trying to restore training state but checkpoint contains only the model. This is probably due to ModelCheckpoint.save_weights_only being set to True.'

FYI, I am using the following command:

python train.py --config configs/Unintended_bias_toxic_comment_classification_Albert_revised_training.json -d 1 --num_workers 0 -e 101 -r model_ckpts/unbiased-albert-c8519128_modified_state_dict.ckpt
Inspecting the checkpoint file, I can indeed see that it is missing some components, the most critical of which (I think) is optimizer_states. Comparing it to one of my own checkpoints, the absent keys appear to be: ['pytorch-lightning_version', 'callbacks', 'optimizer_states', 'lr_schedulers', 'hparams_name', 'hyper_parameters'].
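For anyone wanting to reproduce this comparison, the check above can be sketched as a diff of the top-level keys. The key sets below are hard-coded to mirror what I observed; in practice you would populate them with torch.load(path, map_location="cpu").keys() for each file.

```python
# Sketch: diff the top-level keys of a full PyTorch Lightning checkpoint
# against a weights-only one. The sets below simulate the two files;
# replace them with torch.load(..., map_location="cpu").keys() on real data.

# Typical top-level contents of a checkpoint saved with
# ModelCheckpoint(save_weights_only=False):
full_ckpt_keys = {
    "state_dict",
    "pytorch-lightning_version",
    "callbacks",
    "optimizer_states",
    "lr_schedulers",
    "hparams_name",
    "hyper_parameters",
}

# A weights-only checkpoint keeps just the model parameters:
weights_only_keys = {"state_dict"}

missing = sorted(full_ckpt_keys - weights_only_keys)
print(missing)
# → ['callbacks', 'hparams_name', 'hyper_parameters', 'lr_schedulers',
#    'optimizer_states', 'pytorch-lightning_version']
```

Without optimizer_states and lr_schedulers, Lightning cannot restore the training state, which is exactly what the KeyError reports.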
Am I doing something wrong? If not, would it be possible for you to share new versions of your checkpoints that include these missing components?
Yes, we only saved the weights to keep the files small, since the optimizer state is not needed for prediction. If you used the same data and followed the same training instructions, your own full checkpoint should be equivalent to ours, which you could verify by running both models on the test set and comparing scores.
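If resuming exact training state isn't required, one workaround is to warm-start from the weights-only checkpoint: load its state_dict into the model manually and start a fresh training run instead of passing it to -r. A minimal sketch, assuming the checkpoint's state_dict keys carry a "model." prefix from the LightningModule wrapper (the actual prefix in detoxify's checkpoints may differ, so inspect the keys first):

```python
# Hedged sketch: warm-start from a weights-only checkpoint rather than
# resuming. Key names here are illustrative, not detoxify's actual layout.

def strip_prefix(state_dict, prefix="model."):
    """Drop a leading submodule prefix so a plain nn.Module accepts the keys."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

# In practice:
#   ckpt = torch.load("model_ckpts/unbiased-albert-c8519128.ckpt",
#                     map_location="cpu")
#   model.load_state_dict(strip_prefix(ckpt["state_dict"]))

# Tiny demonstration with dummy tensors replaced by ints:
example = {"model.albert.embeddings.weight": 0, "classifier.bias": 1}
print(strip_prefix(example))
# → {'albert.embeddings.weight': 0, 'classifier.bias': 1}
```

The optimizer and LR scheduler then start from their defaults, so the run is not a bit-exact continuation, but the model itself picks up from the released weights.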