Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: add test for class weights (py_dataset adapter) #20638

Merged
merged 5 commits into from
Dec 17, 2024

Conversation

edge7
Copy link
Contributor

@edge7 edge7 commented Dec 12, 2024

No description provided.

@codecov-commenter
Copy link

codecov-commenter commented Dec 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.54%. Comparing base (4c05e0c) to head (5a12688).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #20638      +/-   ##
==========================================
+ Coverage   82.53%   82.54%   +0.01%     
==========================================
  Files         530      530              
  Lines       49058    49060       +2     
  Branches     7627     7627              
==========================================
+ Hits        40489    40497       +8     
+ Misses       6731     6726       -5     
+ Partials     1838     1837       -1     
Flag Coverage Δ
keras 82.38% <100.00%> (+0.01%) ⬆️
keras-jax 65.74% <100.00%> (+0.01%) ⬆️
keras-numpy 60.62% <100.00%> (+0.01%) ⬆️
keras-tensorflow 66.52% <92.85%> (+0.01%) ⬆️
keras-torch 65.56% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@edge7 edge7 marked this pull request as ready for review December 12, 2024 17:25
@edge7
Copy link
Contributor Author

edge7 commented Dec 12, 2024

Simple PR to add tests for:

TODO: test class_weight

in py_dataset_adapter.PyDataset

Needed to add 2 calls to _standardize_batch:

  • _finite_generator
  • _infinite_generator

as I've noticed that _standardize_batch handles class_weights, but it was just called from get_tf_dataset. Hope there are no side effect, but tests are fine.

Copy link
Collaborator

@fchollet fchollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

@edge7 edge7 force-pushed the test_class_weight_pydataset branch from 062eda8 to b396c7a Compare December 12, 2024 18:41
@edge7
Copy link
Contributor Author

edge7 commented Dec 12, 2024

Added a new commit where _standardize_batch is also called from:

  • _finite_enqueuer_generator
  • _infinite_enqueuer_generator

as those are probably even more used! Tested with real training, no crashes or anything like that

@edge7 edge7 requested a review from fchollet December 12, 2024 18:45
@edge7 edge7 force-pushed the test_class_weight_pydataset branch from b396c7a to 4628ddf Compare December 12, 2024 19:08
@edge7 edge7 force-pushed the test_class_weight_pydataset branch from 4628ddf to 7061dbe Compare December 13, 2024 06:57
@edge7
Copy link
Contributor Author

edge7 commented Dec 13, 2024

The PR turned out to be bigger than expected, as I initially thought it only required adding a test. However, I gradually noticed that class_weight was not actually being used in the case of PyDataset.
One note: astype is not supported with PyTorch tensors, so I had to adjust that case.

@edge7 edge7 force-pushed the test_class_weight_pydataset branch from 2df9589 to 4a98688 Compare December 13, 2024 10:49
@@ -121,7 +121,15 @@ def class_weight_to_sample_weights(y, class_weight):
y = np.argmax(y, axis=-1)
else:
y = np.squeeze(y, axis=-1)
y = np.round(y).astype("int32")
y = np.round(y)
if hasattr(y, "astype"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

y = np.round(y) should have converted the tensor to a numpy tensor, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not at the laptop right now to paste something to reproduce it, but it was failing for me. np.round was fine (but still returning a torch tensor)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't get this -- if you apply any np op to an input that isn't numpy (like a torch tensor) it will call __array__ on the object to turn it into a NumPy ndarray. You should never have a torch tensor past that point

Copy link
Contributor Author

@edge7 edge7 Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that isn't very clear.

Given this example:

import torch
import numpy as np
a = torch.from_numpy(np.array([1, 2.5, 3]))
b = np.round(a)

placing a breakpoint at the last line:

I get here:

image
which is here
it tries to apply the bound but:

image

so we enter _wrapit
which is here

Please note that following the debugging flow into:

conv = _array_converter(obj)
bring me here, i.e.: we are in the torch domain now.
image

then when back to _wrapit
the result then is indeed a numpy array:

image

the last call brings me to the following:

image
so I get back a Torch tensor again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip list | grep torch
torch                      2.5.1+cpu
torchvision                0.20.1+cpu

[notice] A new release of pip is available: 24.0 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
❯ pip list | grep numpy
numpy                      2.0.2

Copy link
Contributor Author

@edge7 edge7 Dec 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, am trying to get a better understanding and it looks like that array_wrap is indeed a way to get back the original object after the numpy conversion:

https://numpy.org/doc/2.1/user/basics.interoperability.html#returning-foreign-objects

Which seems to be exactly what torch has implemented.
If you scroll a bit the above link you will see that this torch behaviour is clearly documented with a couple of examples

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this is so messed up. Then we should use ops.convert_to_numpy at the start of the series of calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean at the start of this function or even before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

committed 'convert_to_numpy'

@edge7 edge7 requested a review from fchollet December 17, 2024 08:56
Copy link
Collaborator

@fchollet fchollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@google-ml-butler google-ml-butler bot added kokoro:force-run ready to pull Ready to be merged into the codebase labels Dec 17, 2024
@fchollet fchollet merged commit 2b6c800 into keras-team:master Dec 17, 2024
6 checks passed
@google-ml-butler google-ml-butler bot removed awaiting review ready to pull Ready to be merged into the codebase kokoro:force-run labels Dec 17, 2024
@edge7 edge7 deleted the test_class_weight_pydataset branch December 17, 2024 19:34
@edge7
Copy link
Contributor Author

edge7 commented Dec 17, 2024

Thanks for merging @fchollet
Do you remember what you meant here

# TODO: test inference mode (single output)

?

Just passing batch equal to 1 or something else?

shashaka pushed a commit to shashaka/keras that referenced this pull request Dec 20, 2024
* test: add test for class weights (py_dataset adapter)

* "call _standardize_batch from enqueuer"

m

* add more tests, handle pytorch astype issue

m

* convert to numpy to ensure consistent handling of operations
fchollet pushed a commit that referenced this pull request Dec 22, 2024
* Add implementations for random_saturation

* change parse_factor method to inner method.

* Add implementations for random_color_jitter

* Fix Randomhue (#20652)

* Small fix in random hue

* use self.backend for seed

* test: add test for class weights (py_dataset adapter) (#20638)

* test: add test for class weights (py_dataset adapter)

* "call _standardize_batch from enqueuer"

m

* add more tests, handle pytorch astype issue

m

* convert to numpy to ensure consistent handling of operations

* Fix paths for pytest in contribution guide (#20655)

* Add preliminary support of OpenVINO as Keras 3 backend (#19727)

* [POC][OV] Support OpenVINO as Keras 3 backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark all unsupported ops from numpy space

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark unsupported ops in core, image, and linalg spaces

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark unsupported ops in math, nn, random, and rnn spaces

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorting imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix inference

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove openvino specific code in common part

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix typo

* Clean-up code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Recover imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Sort imports properly

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format source code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Format the rest of source code

Signed-off-by: Kazantsev, Roman <[email protected]>

* Continue format adjustment

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add OpenVINO dependency

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix inference using OV backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support bert_base_en_uncased and mobilenet_v3_small from Keras Hub

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove extra openvino specific code from layer.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply code-style formatting

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply code-style formatting

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix remained code-style issue

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run tests for OpenVINO backend in GHA

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add config file for openvino backend validation

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add import test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix error in import_test.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add import_test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add openvino specific integration tests in GHA

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude coverage for OpenVINO

Signed-off-by: Kazantsev, Roman <[email protected]>

* remove coverage for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try layer tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run layer tests for openvino backend selectively

Signed-off-by: Kazantsev, Roman <[email protected]>

* Mark enabled tests for openvino backend in a different way

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update .github/workflows/actions.yml

* Fix import for BackendVariable

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix errors in layer tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add test for Elu via openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix sorted imports

Signed-off-by: Kazantsev, Roman <[email protected]>

* Extend testing for attention

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update keras/src/layers/attention/attention_test.py

* Switch on activation tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on attention tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update keras/src/layers/attention/additive_attention_test.py

* Update keras/src/layers/attention/grouped_query_attention_test.py

* Run conv tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix convolution in openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Work around constant creation for tuple

Signed-off-by: Kazantsev, Roman <[email protected]>

* Work around constant creation in reshape

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run depthwise conv tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix get_ov_output for other x types

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix elu translation

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix softmax and log_softmax for None axis

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run nn tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix numpy operations for axis to be None

Signed-off-by: Kazantsev, Roman <[email protected]>

* Run operation_test for openvino_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on math_test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on image tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on linalg test for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Extend OpenVINOKerasTensor with new built-in methods and fix shape op

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on core tests for openvino backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Use different way of OpenVINO model creation that supports call method

Signed-off-by: Kazantsev, Roman <[email protected]>

* Unify integration test for openvino

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support new operations abs, mod, etc.

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add support for more operations like squeeze, max

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try to use excluded test files list

Signed-off-by: Kazantsev, Roman <[email protected]>

* Apply formatting for normalization_test.py

Signed-off-by: Kazantsev, Roman <[email protected]>

* Correct GHA yml file

Signed-off-by: Kazantsev, Roman <[email protected]>

* Test that openvino backend is used

Signed-off-by: Kazantsev, Roman <[email protected]>

* Revert testing change in excluded test files list

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include testing group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include legacy test group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude legacy group of tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include initializers tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Skip tests for initializers group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove export test group from ignore

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include dtype_policies test group

Signed-off-by: Kazantsev, Roman <[email protected]>

* Reduce ignored tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix ops.cast

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add decorator for custom_gradient

Signed-off-by: Kazantsev, Roman <[email protected]>

* Shorten line in custom_gradient

Signed-off-by: Kazantsev, Roman <[email protected]>

* Ignore dtype_policy_map test

Signed-off-by: Kazantsev, Roman <[email protected]>

* Include callback tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on backend tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Exclude failing tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Correct paths to excluded tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some layers tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Remove pytest.mark.openvino_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Register mark requires_trainable_backend

Signed-off-by: Kazantsev, Roman <[email protected]>

* Ignore test files in a different way

Signed-off-by: Kazantsev, Roman <[email protected]>

* Try different way to ignore test files

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix GHA yml

Signed-off-by: Kazantsev, Roman <[email protected]>

* Support tuple axis for logsumexp

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some ops tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Switch on some callbacks tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add openvino export

Signed-off-by: Kazantsev, Roman <[email protected]>

* Update sklearn tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add a comment to skipp numerical_test

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add custom requirements file for OpenVINO

Signed-off-by: Kazantsev, Roman <[email protected]>

* Add reqs of openvino installation for api changes check

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix types of Variables and switch on some variables tests

Signed-off-by: Kazantsev, Roman <[email protected]>

* Fix nightly code check

Signed-off-by: Kazantsev, Roman <[email protected]>

---------

Signed-off-by: Kazantsev, Roman <[email protected]>

* Make sklearn dependency optional (#20657)

* Add a condition to verify training status during image processing (#20650)

* Add a condition to verify training status during image processing

* resolve merge conflict

* fix transform_bounding_boxes logic

* add transform_bounding_boxes test

* Fix recurrent dropout for GRU. (#20656)

The simplified implementation, which used the same recurrent dropout masks for all the previous states didn't work and caused the training to not converge with large enough recurrent dropout values.

This new implementation is now the same as Keras 2. Note that recurrent dropout requires "implementation 1" to be turned on.

Fixes #20276

* Fix example title in probabilistic_metrics.py (#20662)

* Change recurrent dropout implementation for LSTM. (#20663)

This change is to make the implementation of recurrent dropout consistent with GRU (changed as of #20656 ) and Keras 2.

Also fixed a bug where the GRU fix would break when using CUDNN with a dropout and no recurrent dropout. The solution is to create multiple masks only when needed (implementation == 1).

Added coverage for the case when dropout is set and recurrent dropout is not set.

* Never pass enable_xla=False or native_serialization=False in tests (#20664)

These are invalid options in the latest version of jax2tf, they
will just immediately throw.

* Fix `PyDatasetAdapterTest::test_class_weight` test with Torch on GPU. (#20665)

The test was failing because arrays on device and on cpu were compared.

* Fix up torch GPU failing test for mix up (#20666)

We need to make sure to use get any tensors places on cpu before using
them in the tensorflow backend during preprocessing.

* Add random_color_jitter processing layer

* Add random_color_jitter test

* Update test cases

* Correct failed test case

* Correct failed test case

* Correct failed test case

---------

Signed-off-by: Kazantsev, Roman <[email protected]>
Co-authored-by: IMvision12 <[email protected]>
Co-authored-by: Enrico <[email protected]>
Co-authored-by: Marco <[email protected]>
Co-authored-by: Roman Kazantsev <[email protected]>
Co-authored-by: Matt Watson <[email protected]>
Co-authored-by: hertschuh <[email protected]>
Co-authored-by: Jasmine Dhantule <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants