
PIDGAN shows different timing performance for different Keras versions #10

Open · mbarbetti opened this issue on Jul 2, 2024 · 3 comments
Labels: enhancement, help wanted, python

mbarbetti (Owner) commented on Jul 2, 2024

Using the latest version of PIDGAN (v0.2.0), we have noticed unexpected behavior when running GAN trainings with Keras 2 versus Keras 3. In particular, taking scripts/train_GAN_Rich.py as a reference, trained for 10 epochs on a dataset of 300,000 instances, we observe an increase in training time of about 20% when moving from Keras 2 to Keras 3.

Test machine details: Intel(R) Xeon(R) Gold 6140M CPU @ 2.30GHz (no GPU card equipped)

Launched command:

python train_GAN_Rich.py -p pion -E 10 -C 300_000 -D 2016MU --test

Running on Keras 2.14.0:

[...]
Epoch 10/10
102/102 [==============================] - 4s 41ms/step - g_loss: 1.5768 - d_loss: 0.5838 - accuracy: 0.2408 - bce: 2.3640 - g_lr: 3.9292e-04 - d_lr: 4.9445e-04 - val_g_loss: 1.1388 - val_d_loss: 0.5699 - val_accuracy: 0.2435 - val_bce: 2.3272
[INFO] Model training completed in 0h 00min 46s

while running on Keras 3.3.3:

[...]
Epoch 10/10
102/102 ━━━━━━━━━━━━━━━━━━━━ 5s 50ms/step - accuracy: 0.2978 - bce: 1.7565 - d_loss: 0.5944 - g_loss: 1.5055 - g_lr: 3.9292e-04 - d_lr: 4.9445e-04 - val_accuracy: 0.3049 - val_bce: 1.7069 - val_d_loss: 0.6068 - val_g_loss: 0.9470
[INFO] Model training completed in 0h 00min 55s

going from 46 seconds of training time on Keras 2 to 55 seconds on Keras 3 (+20% training time).
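For reference, the percentages quoted in this thread are just the relative wall-clock differences between the two runs; a trivial helper (hypothetical, not part of PIDGAN) makes the arithmetic explicit:

```python
def slowdown_pct(t_keras2: float, t_keras3: float) -> float:
    """Relative increase in wall-clock training time, in percent."""
    return (t_keras3 - t_keras2) / t_keras2 * 100.0

# Figures (in seconds) from the runs reported in this thread:
print(round(slowdown_pct(46, 55)))  # with metrics: 20
print(round(slowdown_pct(34, 40)))  # without metrics: 18
```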

Repeating the exercise without passing any metrics (metrics=None in compile()) on Keras 2.14.0, we have:

[...]
Epoch 10/10
102/102 [==============================] - 3s 31ms/step - g_loss: 1.5659 - d_loss: 0.5860 - g_lr: 3.9292e-04 - d_lr: 4.9445e-04 - val_g_loss: 1.0507 - val_d_loss: 0.5937
[INFO] Model training completed in 0h 00min 34s

while running without any metrics on Keras 3.3.3:

[...]
Epoch 10/10
102/102 ━━━━━━━━━━━━━━━━━━━━ 4s 36ms/step - d_loss: 0.5683 - g_loss: 1.7626 - g_lr: 3.9292e-04 - d_lr: 4.9445e-04 - val_d_loss: 0.5440 - val_g_loss: 1.3810
[INFO] Model training completed in 0h 00min 40s

going from 34 seconds of training time on Keras 2 to 40 seconds on Keras 3 (+18% training time).
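The metrics=None variant above corresponds to compiling the model without any metric tracking. A minimal sketch, using a toy Sequential model in place of the real PIDGAN generator/discriminator pair (the layer sizes here are placeholders):

```python
import keras

# Toy stand-in model; the actual scripts build GAN networks instead.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# metrics=None (the default) skips per-batch metric updates, removing the
# metric-tracking overhead isolated by the comparison above.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=None)
```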

mbarbetti added the help wanted and invalid labels on Jul 2, 2024
mbarbetti (Owner) commented

Repeating the exercise with scripts/train_ANN_isMuon.py as a reference, again for 10 epochs and with a dataset of 300,000 instances, we observe a similar slowdown. However, this script does not rely on a custom training procedure: it uses a PIDGAN model that is a thin wrapper around the Keras Model class. This suggests that the source of the issue lies in Keras 3 itself rather than in PIDGAN.

Test machine details: Intel(R) Xeon(R) Gold 6140M CPU @ 2.30GHz (no GPU card equipped)

Launched command:

python train_ANN_isMuon.py -p pion -E 10 -C 300_000 -D 2016MU --test

Running on Keras 2.14.0:

[...]
Epoch 10/10
102/102 [==============================] - 1s 15ms/step - loss: 0.2095 - auc: 0.7671 - lr: 9.5631e-04 - val_loss: 0.2103 - val_auc: 0.7614
[INFO] Model training completed in 0h 00min 16s

while running on Keras 3.3.3:

[...]
Epoch 10/10
102/102 ━━━━━━━━━━━━━━━━━━━━ 2s 16ms/step - auc: 0.7592 - loss: 0.2191 - lr: 9.5631e-04 - val_auc: 0.7655 - val_loss: 0.2188
[INFO] Model training completed in 0h 00min 19s

going from 16 seconds of training time on Keras 2 to 19 seconds on Keras 3 (+19% training time).

mbarbetti (Owner) commented on Jul 2, 2024

Repeating both of the previous exercises (with scripts/train_ANN_isMuon.py and scripts/train_GAN_Rich.py) for 10 epochs and with a dataset of 300,000 instances, after also removing the learning-rate scheduling (callbacks=None in the fit() method), we observe the same drop in timing performance.
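The callbacks=None change corresponds to calling fit() without the LearningRateScheduler the scripts normally attach. A minimal sketch with a toy model and random data (both hypothetical stand-ins for the real RICH dataset):

```python
import numpy as np
import keras

# Toy model and random data, standing in for the real training setup.
model = keras.Sequential([keras.layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

x = np.random.rand(64, 4).astype("float32")
y = np.random.randint(0, 2, size=(64, 1)).astype("float32")

# callbacks=None disables any learning-rate scheduling, isolating the
# Keras 2 vs Keras 3 timing gap from callback overhead.
history = model.fit(x, y, epochs=1, batch_size=32, callbacks=None, verbose=0)
```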

Test machine details: Intel(R) Xeon(R) Gold 6140M CPU @ 2.30GHz (no GPU card equipped)


Launched command:

python train_ANN_isMuon.py -p pion -E 10 -C 300_000 -D 2016MU --test

Running on Keras 2.14.0:

[...]
Epoch 10/10
102/102 [==============================] - 1s 13ms/step - loss: 0.2196 - auc: 0.7632 - val_loss: 0.2164 - val_auc: 0.7646
[INFO] Model training completed in 0h 00min 15s

while running on Keras 3.3.3:

[...]
Epoch 10/10
102/102 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - auc: 0.7596 - loss: 0.2201 - val_auc: 0.7678 - val_loss: 0.2163
[INFO] Model training completed in 0h 00min 17s

going from 15 seconds of training time on Keras 2 to 17 seconds on Keras 3 (+13% training time).


Launched command:

python train_GAN_Rich.py -p pion -E 10 -C 300_000 -D 2016MU --test

Running on Keras 2.14.0:

[...]
Epoch 10/10
102/102 [==============================] - 4s 40ms/step - g_loss: 1.6483 - d_loss: 0.5900 - accuracy: 0.2859 - bce: 2.0635 - val_g_loss: 0.9593 - val_d_loss: 0.6733 - val_accuracy: 0.2897 - val_bce: 2.0299
[INFO] Model training completed in 0h 00min 44s

while running on Keras 3.3.3:

[...]
Epoch 10/10
102/102 ━━━━━━━━━━━━━━━━━━━━ 5s 49ms/step - accuracy: 0.2851 - bce: 1.9503 - d_loss: 0.5958 - g_loss: 1.5373 - val_accuracy: 0.2896 - val_bce: 1.8913 - val_d_loss: 0.6007 - val_g_loss: 1.0399
[INFO] Model training completed in 0h 00min 54s

going from 44 seconds of training time on Keras 2 to 54 seconds on Keras 3 (+23% training time).

mbarbetti (Owner) commented

Following the suggestion of @fchollet, it seems that by enabling jit_compile=True, Keras 3 outperforms Keras 2 in training time.

source: keras-team/keras#19953 (comment)
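A minimal sketch of the suggested fix, again with a toy model standing in for the real networks: jit_compile=True asks Keras to XLA-compile the train step, which per the linked discussion recovers (and can exceed) the Keras 2 throughput.

```python
import keras

# Toy stand-in model for illustration only.
model = keras.Sequential([
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),
])

# jit_compile=True enables XLA compilation of the train/eval step.
model.compile(optimizer="adam", loss="mse", jit_compile=True)
```

Note that XLA compilation adds a one-off tracing cost on the first training step, so the benefit shows up over many epochs rather than on the very first batch.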

mbarbetti added the enhancement and python labels and removed the invalid label on Jul 5, 2024
mbarbetti self-assigned this on Jul 5, 2024
mbarbetti added a commit that referenced this issue on Jul 19, 2024