
Adjustments to resuming training #1406

Open · wants to merge 1 commit into dev
Conversation

Cauldrath (Contributor)

Currently, resuming partway through an epoch skips the rest of that epoch, and the global step is calculated as only the number of steps into the current epoch.

These changes make training resume mid-epoch at the appropriate step, with the correct global step count, so max steps will be honored.

This also replaces a for loop over every elapsed epoch with a single multiplication.

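The multiplication change described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the actual sd-scripts code:

```python
def starting_global_step(epochs_completed, steps_per_epoch, steps_into_epoch):
    """Recover the global step when resuming training mid-epoch.

    Illustrative sketch: names are assumptions, not sd-scripts variables.
    """
    # Previous approach: loop over every elapsed epoch, e.g.
    #   global_step = steps_into_epoch
    #   for _ in range(epochs_completed):
    #       global_step += steps_per_epoch
    # The PR replaces the loop with a single multiplication:
    return epochs_completed * steps_per_epoch + steps_into_epoch


# Example: resuming 2 full epochs plus 250 steps in, at 500 steps/epoch
print(starting_global_step(2, 500, 250))  # 1250
```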
@slashedstar

This fixed the problems I was having. When resuming from 200 or 400 steps up to 1000 steps (when it outputted "epoch is incremented. current_epoch: 0, epoch: 1"), it worked as intended. But when resuming from 1200 steps onward (when it outputted "epoch is incremented. current_epoch: 0, epoch: 2"), training continued past the maximum number of steps, and it also didn't save a model when reaching max steps.

@kohya-ss (Owner) commented Jul 8, 2024

Thank you for this! Sorry, I didn't test with the --max_train_steps option. In my understanding, this fixes the issue when --max_train_steps is specified.

@Cauldrath (Contributor, Author)

Yes, --max_train_steps combined with resuming or setting --initial_steps is the main problem when training doesn't start in the first epoch.
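To illustrate why a correct global step matters for --max_train_steps: if the resumed global step counts only steps inside the current epoch, the stop condition never fires once training resumes past epoch 0. A minimal sketch with assumed names:

```python
def should_stop(global_step, max_train_steps):
    """Stop condition that --max_train_steps is meant to enforce.

    Hypothetical helper for illustration; not the actual sd-scripts code.
    """
    return max_train_steps is not None and global_step >= max_train_steps


max_train_steps = 1000
steps_per_epoch = 200
resume_step = 1200  # resumed after 6 full epochs

# If the global step is reset to the offset within the current epoch,
# the check can never trigger and training runs past the limit:
undercounted = resume_step % steps_per_epoch   # 0
print(should_stop(undercounted, max_train_steps))  # False

# With the true total step count restored, training stops immediately:
print(should_stop(resume_step, max_train_steps))   # True
```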
